Students: Eran Kayat & Andrey Katunin
Who's a good dog? Who likes ear scratches? Well, it seems those fancy deep neural networks don't have all the answers. However, maybe they can answer that ubiquitous question we all ask when meeting a four-legged stranger: what kind of good pup is that?
Hello, we are Eran and Andrey, and we are doing our project on dog breed identification. For a little background, from a young age we both had a big love for two things: computers and our dogs. When we started working together, we thought this project would be a good opportunity to combine the two.
We found two datasets of images of dogs. Each image has a filename that serves as its unique id. Together, the two datasets comprise 120 breeds of dogs. The goal of the project is to create a classifier capable of determining a dog's breed from a photo.
We think we can create an application that lets you take a picture of a dog and tells you what breed it is.
We all know that feeling when we see a dog and don't know its specific breed.
Also, some breeds are illegal in certain jurisdictions, so a model that recognizes dog breeds could help local authorities enforce the law.
We downloaded two dog breed image datasets, one from Kaggle and the other from Stanford, and uploaded them to our Google Drive.
#from google.colab import drive
# drive.mount('/content/drive')
We will now unzip our images
# import requests, zipfile, io
# z = zipfile.ZipFile('/content/drive/MyDrive/data/Images.zip')
# z.extractall()
# z = zipfile.ZipFile('/content/drive/MyDrive/data/dog-breed-identification.zip')
# z.extractall()
# Helper functions we will later use to copy our images into certain directories
import os, shutil

def mkdirIfNotExist(directory):
    if not os.path.exists(directory):
        os.mkdir(directory)
    return directory

def copyIfNotExist(fnames, src_dir, dst_dir):
    nCopied = 0
    for fname in fnames:
        src = fname
        dst = os.path.join(dst_dir, fname.split("/")[1])  # for Colab paths
        if not os.path.exists(dst):
            shutil.copyfile(src, dst)
            nCopied += 1
    if nCopied > 0:
        print("Copied %d to %s" % (nCopied, dst_dir))
# # Path variables colab
# STANFORD_PATH = '/content/Images'
# KAGGLE_PATH = '/content/dog-breed-identification/train'
# KAGGLE_LABEL_PATH = '/content/dog-breed-identification/labels.csv'
# VALID_PATH = mkdirIfNotExist('/content/validation')
# TRAIN_PATH = mkdirIfNotExist('/content/train')
# Path variables windows
STANFORD_PATH = r'C:\content\Images'
KAGGLE_PATH = r'C:\content\dog-breed-identification\train'
KAGGLE_LABEL_PATH = r'C:\content\dog-breed-identification\labels.csv'
VALID_PATH = mkdirIfNotExist(r'C:\content\validation')
TRAIN_PATH = mkdirIfNotExist(r'C:\content\train')
Imports of the libraries we are going to use for our task
import pandas as pd
import webbrowser, os
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from tqdm import tqdm
import random
import math
#import skimage.io
import sklearn
import cv2
from sklearn import metrics
from sklearn.metrics import confusion_matrix, classification_report
sns.set()
pd.set_option('display.expand_frame_repr', False)
Now our images sit in two different directories, and the directory structure differs between them. We want to merge the datasets into one and continue from there.
import warnings
warnings.filterwarnings('ignore')
We don't want to see any warnings.
Let's start with the Kaggle dataset and load it
df_labels_kaggle = pd.read_csv(KAGGLE_LABEL_PATH)
df_labels_kaggle
| id | breed | |
|---|---|---|
| 0 | 000bec180eb18c7604dcecc8fe0dba07 | boston_bull |
| 1 | 001513dfcb2ffafc82cccf4d8bbaba97 | dingo |
| 2 | 001cdf01b096e06d78e9e5112d419397 | pekinese |
| 3 | 00214f311d5d2247d5dfe4fe24b2303d | bluetick |
| 4 | 0021f9ceb3235effd7fcde7f7538ed62 | golden_retriever |
| ... | ... | ... |
| 10217 | ffd25009d635cfd16e793503ac5edef0 | borzoi |
| 10218 | ffd3f636f7f379c51ba3648a9ff8254f | dandie_dinmont |
| 10219 | ffe2ca6c940cddfee68fa3cc6c63213f | airedale |
| 10220 | ffe5f6d8e2bff356e9482a80a6e29aac | miniature_pinscher |
| 10221 | fff43b07992508bc822f33d8ffd902ae | chesapeake_bay_retriever |
10222 rows × 2 columns
We can see that this dataset has 10,222 rows and that each row contains the id of an image and the name of the dog's breed.
Now we will load our Stanford data set
import os

# Creating an empty dataframe of id and breed to match the Kaggle dataset
df_stanford = pd.DataFrame(columns=['id', 'breed'])
list_of_names = []

# Iterate over a directory and return all the file names as a list
def listdir(dir):
    temp_list = []
    filenames = os.listdir(dir)
    for files in filenames:
        temp_list.append(files)
    return temp_list

list_of_names = listdir(STANFORD_PATH)

# Add every dog image from every breed directory to one dataframe
for name in list_of_names:
    list_dog_in_breed = listdir(STANFORD_PATH + '/' + name)
    clean_dog_name = name.split('-', 1)[1].lower()  # e.g. 'n02085620-Chihuahua' -> 'chihuahua'
    for dog_name in list_dog_in_breed:
        df_stanford = df_stanford.append({'id': dog_name.split('.')[0],
                                          'breed': clean_dog_name}, ignore_index=True)
df_stanford
| id | breed | |
|---|---|---|
| 0 | n02085620_10074 | chihuahua |
| 1 | n02085620_10131 | chihuahua |
| 2 | n02085620_10621 | chihuahua |
| 3 | n02085620_1073 | chihuahua |
| 4 | n02085620_10976 | chihuahua |
| ... | ... | ... |
| 20575 | n02116738_9798 | african_hunting_dog |
| 20576 | n02116738_9818 | african_hunting_dog |
| 20577 | n02116738_9829 | african_hunting_dog |
| 20578 | n02116738_9844 | african_hunting_dog |
| 20579 | n02116738_9924 | african_hunting_dog |
20580 rows × 2 columns
We can see that we have created a dataframe from the Stanford directories.
It has 20,580 entries, each with an id and a dog breed name.
Now we will merge the two datasets into one using an outer join
df = df_stanford.merge(df_labels_kaggle,how='outer')
df
| id | breed | |
|---|---|---|
| 0 | n02085620_10074 | chihuahua |
| 1 | n02085620_10131 | chihuahua |
| 2 | n02085620_10621 | chihuahua |
| 3 | n02085620_1073 | chihuahua |
| 4 | n02085620_10976 | chihuahua |
| ... | ... | ... |
| 30797 | ffd25009d635cfd16e793503ac5edef0 | borzoi |
| 30798 | ffd3f636f7f379c51ba3648a9ff8254f | dandie_dinmont |
| 30799 | ffe2ca6c940cddfee68fa3cc6c63213f | airedale |
| 30800 | ffe5f6d8e2bff356e9482a80a6e29aac | miniature_pinscher |
| 30801 | fff43b07992508bc822f33d8ffd902ae | chesapeake_bay_retriever |
30802 rows × 2 columns
After the merge we have 30,802 rows (20,580 + 10,222, so the two datasets evidently share no rows).
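As a toy illustration (with hypothetical ids) of how `how='outer'` combines the two frames, keeping the union of rows and collapsing any (id, breed) pair that appears in both:

```python
import pandas as pd

# Hypothetical ids for illustration only.
a = pd.DataFrame({'id': ['x1', 'x2'], 'breed': ['pug', 'beagle']})
b = pd.DataFrame({'id': ['x2', 'x3'], 'breed': ['beagle', 'borzoi']})

# With no 'on' argument, merge joins on the shared columns ['id', 'breed'];
# how='outer' keeps rows from both frames, and the shared row appears once.
merged = a.merge(b, how='outer')
print(len(merged))  # 3 rows: x1, x2, x3
```

Since our real merge produced exactly the sum of the two row counts, every (id, breed) pair was unique to its source dataset.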
Let's see how many labels we have
labels_names = df["breed"].unique()
labels_names.sort()  # ndarray.sort() sorts in place (and returns None)
labels = dict(zip(range(len(labels_names)), labels_names))
labels
{0: 'affenpinscher',
1: 'afghan_hound',
2: 'african_hunting_dog',
3: 'airedale',
4: 'american_staffordshire_terrier',
5: 'appenzeller',
6: 'australian_terrier',
7: 'basenji',
8: 'basset',
9: 'beagle',
10: 'bedlington_terrier',
11: 'bernese_mountain_dog',
12: 'black-and-tan_coonhound',
13: 'blenheim_spaniel',
14: 'bloodhound',
15: 'bluetick',
16: 'border_collie',
17: 'border_terrier',
18: 'borzoi',
19: 'boston_bull',
20: 'bouvier_des_flandres',
21: 'boxer',
22: 'brabancon_griffon',
23: 'briard',
24: 'brittany_spaniel',
25: 'bull_mastiff',
26: 'cairn',
27: 'cardigan',
28: 'chesapeake_bay_retriever',
29: 'chihuahua',
30: 'chow',
31: 'clumber',
32: 'cocker_spaniel',
33: 'collie',
34: 'curly-coated_retriever',
35: 'dandie_dinmont',
36: 'dhole',
37: 'dingo',
38: 'doberman',
39: 'english_foxhound',
40: 'english_setter',
41: 'english_springer',
42: 'entlebucher',
43: 'eskimo_dog',
44: 'flat-coated_retriever',
45: 'french_bulldog',
46: 'german_shepherd',
47: 'german_short-haired_pointer',
48: 'giant_schnauzer',
49: 'golden_retriever',
50: 'gordon_setter',
51: 'great_dane',
52: 'great_pyrenees',
53: 'greater_swiss_mountain_dog',
54: 'groenendael',
55: 'ibizan_hound',
56: 'irish_setter',
57: 'irish_terrier',
58: 'irish_water_spaniel',
59: 'irish_wolfhound',
60: 'italian_greyhound',
61: 'japanese_spaniel',
62: 'keeshond',
63: 'kelpie',
64: 'kerry_blue_terrier',
65: 'komondor',
66: 'kuvasz',
67: 'labrador_retriever',
68: 'lakeland_terrier',
69: 'leonberg',
70: 'lhasa',
71: 'malamute',
72: 'malinois',
73: 'maltese_dog',
74: 'mexican_hairless',
75: 'miniature_pinscher',
76: 'miniature_poodle',
77: 'miniature_schnauzer',
78: 'newfoundland',
79: 'norfolk_terrier',
80: 'norwegian_elkhound',
81: 'norwich_terrier',
82: 'old_english_sheepdog',
83: 'otterhound',
84: 'papillon',
85: 'pekinese',
86: 'pembroke',
87: 'pomeranian',
88: 'pug',
89: 'redbone',
90: 'rhodesian_ridgeback',
91: 'rottweiler',
92: 'saint_bernard',
93: 'saluki',
94: 'samoyed',
95: 'schipperke',
96: 'scotch_terrier',
97: 'scottish_deerhound',
98: 'sealyham_terrier',
99: 'shetland_sheepdog',
100: 'shih-tzu',
101: 'siberian_husky',
102: 'silky_terrier',
103: 'soft-coated_wheaten_terrier',
104: 'staffordshire_bullterrier',
105: 'standard_poodle',
106: 'standard_schnauzer',
107: 'sussex_spaniel',
108: 'tibetan_mastiff',
109: 'tibetan_terrier',
110: 'toy_poodle',
111: 'toy_terrier',
112: 'vizsla',
113: 'walker_hound',
114: 'weimaraner',
115: 'welsh_springer_spaniel',
116: 'west_highland_white_terrier',
117: 'whippet',
118: 'wire-haired_fox_terrier',
119: 'yorkshire_terrier'}
We can see that we have 120 different dog breeds
Let's give each breed a numeric label
label_values = list(labels.values())
lbl = []
for i in range(len(df["breed"])):
    lbl.append(label_values.index(df.breed[i]))
df['lbl'] = lbl
df
| id | breed | lbl | |
|---|---|---|---|
| 0 | n02085620_10074 | chihuahua | 29 |
| 1 | n02085620_10131 | chihuahua | 29 |
| 2 | n02085620_10621 | chihuahua | 29 |
| 3 | n02085620_1073 | chihuahua | 29 |
| 4 | n02085620_10976 | chihuahua | 29 |
| ... | ... | ... | ... |
| 30797 | ffd25009d635cfd16e793503ac5edef0 | borzoi | 18 |
| 30798 | ffd3f636f7f379c51ba3648a9ff8254f | dandie_dinmont | 35 |
| 30799 | ffe2ca6c940cddfee68fa3cc6c63213f | airedale | 3 |
| 30800 | ffe5f6d8e2bff356e9482a80a6e29aac | miniature_pinscher | 75 |
| 30801 | fff43b07992508bc822f33d8ffd902ae | chesapeake_bay_retriever | 28 |
30802 rows × 3 columns
Now, for each dog in the dataset, let's add the path to its image
path_img = []
for i in range(len(df["id"])):
    path_img.append(KAGGLE_PATH + "/" + str(df.id[i]) + ".jpg")
df['path_img'] = path_img
df.head()
| id | breed | lbl | path_img | |
|---|---|---|---|---|
| 0 | n02085620_10074 | chihuahua | 29 | C:\content\dog-breed-identification\train/n020... |
| 1 | n02085620_10131 | chihuahua | 29 | C:\content\dog-breed-identification\train/n020... |
| 2 | n02085620_10621 | chihuahua | 29 | C:\content\dog-breed-identification\train/n020... |
| 3 | n02085620_1073 | chihuahua | 29 | C:\content\dog-breed-identification\train/n020... |
| 4 | n02085620_10976 | chihuahua | 29 | C:\content\dog-breed-identification\train/n020... |
df
| id | breed | lbl | path_img | |
|---|---|---|---|---|
| 0 | n02085620_10074 | chihuahua | 29 | C:\content\dog-breed-identification\train/n020... |
| 1 | n02085620_10131 | chihuahua | 29 | C:\content\dog-breed-identification\train/n020... |
| 2 | n02085620_10621 | chihuahua | 29 | C:\content\dog-breed-identification\train/n020... |
| 3 | n02085620_1073 | chihuahua | 29 | C:\content\dog-breed-identification\train/n020... |
| 4 | n02085620_10976 | chihuahua | 29 | C:\content\dog-breed-identification\train/n020... |
| ... | ... | ... | ... | ... |
| 30797 | ffd25009d635cfd16e793503ac5edef0 | borzoi | 18 | C:\content\dog-breed-identification\train/ffd2... |
| 30798 | ffd3f636f7f379c51ba3648a9ff8254f | dandie_dinmont | 35 | C:\content\dog-breed-identification\train/ffd3... |
| 30799 | ffe2ca6c940cddfee68fa3cc6c63213f | airedale | 3 | C:\content\dog-breed-identification\train/ffe2... |
| 30800 | ffe5f6d8e2bff356e9482a80a6e29aac | miniature_pinscher | 75 | C:\content\dog-breed-identification\train/ffe5... |
| 30801 | fff43b07992508bc822f33d8ffd902ae | chesapeake_bay_retriever | 28 | C:\content\dog-breed-identification\train/fff4... |
30802 rows × 4 columns
Let's look at what data we have
num_images = len(df["id"])
print('Number of images in Training file:', num_images)
num_labels=len(labels_names)
print('Number of dog breeds in Training file:', num_labels)
Number of images in Training file: 30802
Number of dog breeds in Training file: 120
We have 30,802 images split among 120 categories.
Now let's check the distribution.
First, we drop rows whose image file does not exist on disk (152 images dropped)
count = 0
for pat in df['path_img']:
    if not os.path.exists(pat):
        df.drop(df[df.path_img == pat].index, inplace=True)
        count = count + 1
count
152
bar = df["breed"].value_counts(ascending=True).plot.barh(figsize = (30,120))
plt.title("Distribution of the Dog Breeds", fontsize = 20)
bar.tick_params(labelsize=16)
plt.show()
df["breed"].value_counts(ascending=False)
maltese_dog 369
scottish_deerhound 358
afghan_hound 355
bernese_mountain_dog 332
pomeranian 330
...
redbone 220
briard 218
golden_retriever 217
eskimo_dog 216
chihuahua 71
Name: breed, Length: 120, dtype: int64
We can see that some categories contain much less data than others. This might affect the accuracy of the model. We will leave it for now and maybe look for more data later.
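One standard mitigation for this imbalance (not applied here yet) is to weight classes inversely to their frequency; a minimal sketch with scikit-learn's `compute_class_weight`, using made-up counts rather than our real data:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Made-up label array: class 0 is five times more common than class 1.
y = np.array([0] * 500 + [1] * 100)

# 'balanced' assigns n_samples / (n_classes * count_c) to each class c,
# so the rare class receives the larger weight.
weights = compute_class_weight(class_weight='balanced',
                               classes=np.unique(y), y=y)
print(dict(zip(np.unique(y), weights)))  # {0: 0.6, 1: 3.0}
```

Many scikit-learn estimators accept such weights directly via a `class_weight` parameter.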
import PIL

def get_image_dim(image_id):
    image = PIL.Image.open(f"{image_id}")
    width, height = image.size
    return (width, height)

tqdm.pandas()
df['image_dim'] = df['path_img'].progress_apply(get_image_dim)
df['width'] = df['image_dim'].apply(lambda x: x[0])   # PIL's size is (width, height)
df['height'] = df['image_dim'].apply(lambda x: x[1])
fig, axis = plt.subplots(1, 2, figsize=(15, 7))
sns.distplot(df['height'], ax=axis[0])
sns.distplot(df['width'], ax=axis[1])
plt.show()
100%|██████████████████████████████████████████████████████████████████████████| 30650/30650 [00:08<00:00, 3502.85it/s]
We can see that the resolution of the images is not too high and that the distribution is centered around 400 pixels for both dimensions.
fig, axes = plt.subplots(nrows=4, ncols=5, figsize=(15, 15),
                         subplot_kw={'xticks': [], 'yticks': []})
for i, ax in enumerate(axes.flat, 23000):  # show 20 sample images starting at index 23000
    ax.imshow(plt.imread(df.path_img[i]))
    ax.set_title(str(i) + " " + df.breed[i])
plt.tight_layout()
plt.show()
%pylab inline
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
f = plt.figure()
f.add_subplot(1,2, 1)
plt.imshow(mpimg.imread(f"{STANFORD_PATH}/n02099712-Labrador_retriever/n02099712_2224.jpg"))
f.add_subplot(1,2, 2)
plt.imshow(mpimg.imread(f"{STANFORD_PATH}/n02099712-Labrador_retriever/n02099712_5000.jpg"))
plt.show(block=True)
Populating the interactive namespace from numpy and matplotlib
As we can see from the pictures above, these dogs are the same breed, but the dogs' colors and poses might make it hard for the algorithms to classify them correctly.
Let's take a look at two other images
f = plt.figure()
f.add_subplot(1,2, 1)
plt.imshow(mpimg.imread(f"{STANFORD_PATH}/n02110063-malamute/n02110063_609.jpg"))
f.add_subplot(1,2, 2)
plt.imshow(mpimg.imread(f"{STANFORD_PATH}/n02110185-Siberian_husky/n02110185_4677.jpg"))
plt.show(block=True)
This time the pictures show two different breeds that look very much alike, so the algorithms might get the classification wrong and decide they are the same breed.
Now we will create a copy of this df to perform feature extraction
df_classification = df.copy()
Drop the id and breed columns, as those are not relevant features
df_classification = df_classification.drop(columns = ["id","breed"])
df_classification.reset_index(inplace=True)
Now we will convert the images to pixel-vector features using OpenCV and then try to classify them using MultinomialNB
SIZE = 128  # resizing images to 128x128

new_df = pd.DataFrame()
for index, row in df_classification.iterrows():  # iterating over all the images
    img = cv2.imread(row["path_img"], cv2.IMREAD_COLOR)  # reading color images
    img = cv2.resize(img, (SIZE, SIZE))                  # resize images
    x = img.reshape(-1)                                  # flatten to one pixel vector
    new_df = new_df.append(pd.DataFrame(x).T, ignore_index=True)
new_df is a data frame with the pixels as its columns.
Now that we have all the images as pixel vectors, we can start applying classification algorithms.
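A quick sanity check on the expected number of pixel columns, assuming SIZE = 128 and 3 colour channels as above:

```python
SIZE = 128      # resized side length used above
CHANNELS = 3    # BGR channels from cv2.IMREAD_COLOR
n_features = SIZE * SIZE * CHANNELS
print(n_features)  # 49152 pixel columns per image
```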
from sklearn.utils import shuffle
new_df['lbl'] = df_classification['lbl']
new_df = shuffle(new_df)
new_df.reset_index(inplace=True)
X = new_df.drop(['index','lbl'], axis=1)
y = new_df['lbl']
Here we defined our features and labels (X,y)
Splitting the data into train and test sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
Because we don't have enough memory to train on the whole dataset at once, we need to use algorithms that support partial fitting. This is why we decided to start with MultinomialNB.
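The general partial_fit pattern looks like this; a hedged sketch using synthetic data and scikit-learn's SGDClassifier (another estimator that supports incremental training), not our actual images:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.RandomState(42)
X = rng.rand(600, 20)                  # synthetic features in [0, 1]
y = (X[:, 0] > 0.5).astype(int)        # label depends only on feature 0

clf = SGDClassifier(random_state=42)
classes = np.unique(y)                 # must be passed on the first partial_fit call
for start in range(0, len(X), 100):    # mini-batches of 100 samples
    clf.partial_fit(X[start:start + 100], y[start:start + 100], classes=classes)

print(clf.score(X, y))                 # well above the 0.5 chance level
```

Each partial_fit call updates the model with one batch, so only a batch needs to fit in memory at a time.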
from sklearn.naive_bayes import MultinomialNB
clf = MultinomialNB()
Now we iterate over our training data, dividing the pixel values by 255 so they are normalized to [0, 1], and partially fit the model on each batch
classes = list(range(120))  # one entry per breed label (0..119)
batch_size = len(X_train) // 100
for i in range(len(X_train) // batch_size + 1):
    x_partial = X_train.iloc[i * batch_size : (i + 1) * batch_size]
    y_partial = y_train.iloc[i * batch_size : (i + 1) * batch_size]
    if len(x_partial) == 0:
        break
    x_partial = x_partial.div(255)  # normalize pixel values to [0, 1]
    clf.partial_fit(x_partial, y_partial, classes=classes)
Getting our predictions
predicitions = clf.predict(X_test.div(255))
print ("Accuracy = ", metrics.accuracy_score(y_test, predicitions))
Accuracy = 0.04489103484275088
We can see that we got about 4.5% accuracy on the test data; it's very low and will not suit a production application.
cm = confusion_matrix(y_test, predicitions)
fig, ax = plt.subplots(figsize=(40,40)) # Sample figsize in inches
sns.set(font_scale=1.6)
sns.heatmap(cm, annot=True, ax=ax)
plt.show()
Ideally, at 100% accuracy, we would see the whole diagonal in a light color and everything else black; for now the picture is quite chaotic.
Hopefully we can improve in the future using more advanced techniques.
We can see that we got about 4 percent accuracy, which is quite bad; now we will try to improve the accuracy by adding Gabor features.
Before, we tried to use the whole dataset and ran into memory issues; now we will only use as many images per category as the smallest category contains.
The smallest category has 216 images, so we will reduce every category to 216 images.
from sklearn.utils import shuffle
equal_df = shuffle(df)
equal_df = equal_df.groupby('lbl').head(216)
equal_df.reset_index(inplace=True)
Basically, a Gabor filter analyzes whether there is any specific frequency content in the image, in specific directions, in a localized region around the point of analysis.
Now we will show an example of a Gabor filter; we can see that in the second picture the whole image is blurred and only the faces stand out.
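To make the orientation selectivity concrete, a small numpy-only sketch with a hand-rolled real-part Gabor kernel (for illustration only, not OpenCV's getGaborKernel): a kernel tuned to vertical stripes responds far more strongly to a vertical-stripe patch than to the same stripes rotated 90 degrees.

```python
import numpy as np

def gabor_kernel(size=25, sigma=4.0, theta=0.0, lamda=8.0, gamma=0.5):
    """Real part of a Gabor kernel: a Gaussian envelope times a cosine carrier."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    x_t = xs * np.cos(theta) + ys * np.sin(theta)    # rotate coordinates by theta
    y_t = -xs * np.sin(theta) + ys * np.cos(theta)
    envelope = np.exp(-(x_t**2 + gamma**2 * y_t**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * x_t / lamda)        # oscillation along x_t
    return envelope * carrier

size = 25
k = gabor_kernel(theta=0.0)                          # tuned to vertical stripes
xs = np.arange(size)
vertical = np.cos(2 * np.pi * xs / 8.0)[None, :].repeat(size, axis=0)
horizontal = vertical.T                              # same stripes, rotated 90 degrees

resp_v = abs((k * vertical).sum())                   # response at the patch center
resp_h = abs((k * horizontal).sum())
print(resp_v > resp_h)  # True: the matched orientation gives the larger response
```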
ksize = 50  # use a size that makes sense for the image and feature size; too large may not be good
# On a synthetic image it is clear how ksize affects the result (try 5 and 50)
sigma = 3  # a large sigma on small features will miss the features entirely
theta = 1 * np.pi / 4  # pi/4 shows one diagonal orientation, 3*pi/4 the other; try other values
lamda = 1 * np.pi / 4  # pi/4 works best for angled features
gamma = 0.4  # a value of 1 gives a spherical kernel; values close to 0 give a high aspect ratio
# A value of 1 (spherical) may not be ideal as it picks up features from other regions
phi = 0  # phase offset; we leave it at 0

kernel = cv2.getGaborKernel((ksize, ksize), sigma, theta, lamda, gamma, phi, ktype=cv2.CV_32F)
plt.imshow(kernel)
img = cv2.imread(equal_df['path_img'][310])
fimg = cv2.filter2D(img, -1, kernel)  # ddepth=-1: output has the same depth as the input
kernel_resized = cv2.resize(kernel, (128, 128))  # resize kernel for display
plt.imshow(img)
plt.show()
plt.imshow(fimg)
plt.show()
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Now we will resize the images to 64 by 64 because of memory restrictions and extract four different sets of Gabor features (two thetas times two sigmas).
SIZE = 64  # resizing images to 64x64

gabor_df = pd.DataFrame()
for index, row in equal_df.iterrows():
    img = cv2.imread(row["path_img"], cv2.IMREAD_COLOR)  # reading color images
    img = cv2.resize(img, (SIZE, SIZE))                  # resize images
    temp_df = pd.DataFrame()
    num = 1  # counter used to label the Gabor features in the data frame
    kernels = []
    for theta in range(2):  # two thetas: 0 and pi/4
        theta = theta / 4. * np.pi
        for sigma in (1, 3):  # sigma of 1 and 3
            lamda = np.pi / 4
            gamma = 0.5
            gabor_label = 'Gabor' + str(num)  # label Gabor columns as Gabor1, Gabor2, etc.
            ksize = 25
            kernel = cv2.getGaborKernel((ksize, ksize), sigma, theta, lamda, gamma, 0, ktype=cv2.CV_32F)
            kernels.append(kernel)
            fimg = cv2.filter2D(img, cv2.CV_8UC3, kernel)
            filtered_img = fimg.reshape(-1)
            temp_df[gabor_label] = filtered_img
            num += 1  # increment for the next Gabor column label
    temp_df = temp_df.values.reshape(-1)
    gabor_df = gabor_df.append(pd.DataFrame(temp_df).T, ignore_index=True)
gabor_df['lbl'] = equal_df['lbl']
X = gabor_df.drop('lbl',axis = 1)
y = equal_df['lbl']
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
X_train = X_train.div(255.)
X_train.head()
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 49142 | 49143 | 49144 | 49145 | 49146 | 49147 | 49148 | 49149 | 49150 | 49151 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2237 | 0.223529 | 0.0 | 1.000000 | 0.000000 | 0.301961 | 0.0 | 1.00000 | 0.000000 | 0.501961 | 0.000000 | ... | 1.000000 | 1.000000 | 1.000000 | 1.000000 | 1.0 | 1.000000 | 1.000000 | 1.000000 | 1.0 | 1.000000 |
| 8900 | 0.262745 | 0.0 | 0.862745 | 0.713725 | 0.529412 | 0.0 | 1.00000 | 0.858824 | 0.309804 | 0.011765 | ... | 1.000000 | 0.000000 | 1.000000 | 0.305882 | 1.0 | 1.000000 | 0.956863 | 0.003922 | 1.0 | 0.647059 |
| 21961 | 0.203922 | 0.0 | 0.725490 | 0.000000 | 0.203922 | 0.0 | 0.72549 | 0.000000 | 0.203922 | 0.000000 | ... | 0.415686 | 0.172549 | 0.792157 | 0.392157 | 1.0 | 0.439216 | 0.988235 | 0.529412 | 1.0 | 0.831373 |
| 7793 | 0.274510 | 0.0 | 1.000000 | 0.682353 | 0.396078 | 0.0 | 1.00000 | 1.000000 | 0.435294 | 0.000000 | ... | 0.682353 | 0.274510 | 0.415686 | 0.054902 | 1.0 | 0.643137 | 0.768627 | 0.000000 | 1.0 | 1.000000 |
| 11832 | 0.835294 | 0.0 | 1.000000 | 1.000000 | 0.909804 | 0.0 | 1.00000 | 1.000000 | 0.976471 | 0.000000 | ... | 1.000000 | 0.447059 | 0.666667 | 0.219608 | 1.0 | 0.431373 | 0.600000 | 0.062745 | 1.0 | 0.396078 |
5 rows × 49152 columns
We will try to use support vector machines on our data for classification.
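With decision_function_shape='ovo', scikit-learn's SVC trains one binary SVM per pair of classes; for our 120 breeds that is 120·119/2 pairwise classifiers, which is worth keeping in mind for training time. A quick check of the count:

```python
n_classes = 120
n_pairwise = n_classes * (n_classes - 1) // 2  # one binary SVM per pair of classes
print(n_pairwise)  # 7140
```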

from sklearn import svm
SVM_model = svm.SVC(decision_function_shape='ovo') #For multiclass classification
SVM_model.fit(X_train, y_train)
SVC(decision_function_shape='ovo')
X_test = X_test.div(255.)
X_test.head()
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 49142 | 49143 | 49144 | 49145 | 49146 | 49147 | 49148 | 49149 | 49150 | 49151 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10296 | 1.000000 | 0.015686 | 1.000000 | 1.000000 | 1.000000 | 0.039216 | 1.0 | 1.000000 | 1.000000 | 0.035294 | ... | 1.0 | 1.00000 | 1.000000 | 0.062745 | 1.0 | 1.0 | 1.000000 | 0.149020 | 1.0 | 1.0 |
| 8362 | 0.792157 | 0.000000 | 1.000000 | 0.784314 | 0.945098 | 0.000000 | 1.0 | 0.980392 | 1.000000 | 0.000000 | ... | 1.0 | 1.00000 | 1.000000 | 0.000000 | 1.0 | 1.0 | 1.000000 | 0.000000 | 1.0 | 1.0 |
| 6223 | 0.325490 | 0.121569 | 0.921569 | 0.137255 | 0.427451 | 0.239216 | 1.0 | 0.321569 | 0.592157 | 0.188235 | ... | 1.0 | 0.00000 | 0.658824 | 0.000000 | 1.0 | 0.0 | 1.000000 | 0.000000 | 1.0 | 0.0 |
| 8843 | 0.482353 | 1.000000 | 1.000000 | 0.000000 | 0.400000 | 1.000000 | 1.0 | 0.141176 | 0.529412 | 1.000000 | ... | 1.0 | 0.87451 | 0.403922 | 0.482353 | 1.0 | 1.0 | 0.541176 | 0.435294 | 1.0 | 1.0 |
| 13487 | 1.000000 | 0.301961 | 1.000000 | 1.000000 | 0.745098 | 0.070588 | 1.0 | 0.717647 | 0.200000 | 0.078431 | ... | 1.0 | 1.00000 | 0.913725 | 0.078431 | 1.0 | 1.0 | 0.650980 | 0.192157 | 1.0 | 1.0 |
5 rows × 49152 columns
predicitions = SVM_model.predict(X_test)
from sklearn import metrics
print ("Accuracy = ", metrics.accuracy_score(y_test, predicitions))
Accuracy = 0.2462756052141527
We got about 25% accuracy, which is better than before but still not enough; we think we can do much better.
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, predicitions)
fig, ax = plt.subplots(figsize=(40,40)) # Sample figsize in inches
sns.set(font_scale=1.6)
sns.heatmap(cm, annot=True, ax=ax)
plt.show()
Here we can see that a diagonal is starting to form, but it is still not enough for us.
We will also try a Random Forest

from sklearn.ensemble import RandomForestClassifier
RF_model = RandomForestClassifier(n_estimators = 50, random_state = 42)
RF_model.fit(X_train, y_train)
RandomForestClassifier(n_estimators=50, random_state=42)
predictions_rf = RF_model.predict(X_test)
print ("Accuracy = ", metrics.accuracy_score(y_test, predictions_rf))
Accuracy = 0.44661700806952204
We got about 44% accuracy, which is even better than the SVM approach.
cm = confusion_matrix(y_test, predictions_rf)
fig, ax = plt.subplots(figsize=(40,40)) # Sample figsize in inches
sns.set(font_scale=1.6)
sns.heatmap(cm, annot=True, ax=ax)
plt.show()
We will now run a pretrained VGG-16 network on our images, extract the last layer's activations as features, and then use ML algorithms on the extracted features.

Extracting features with VGG-16
from keras.preprocessing import image
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input
model_vgg16 = VGG16(weights='imagenet', include_top=False)
model_vgg16.summary()
vgg16_feature_list = []
for index, row in df.iterrows():
    img_path = row['path_img']
    img = image.load_img(img_path, target_size=(224, 224))
    img_data = image.img_to_array(img)
    img_data = np.expand_dims(img_data, axis=0)
    img_data = preprocess_input(img_data)
    vgg16_feature = model_vgg16.predict(img_data)
    vgg16_feature_list.append(np.array(vgg16_feature).flatten())
vgg16_feature_list_np = np.array(vgg16_feature_list)
Model: "vgg16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, None, None, 3)]   0
block1_conv1 (Conv2D)        (None, None, None, 64)    1792
block1_conv2 (Conv2D)        (None, None, None, 64)    36928
block1_pool (MaxPooling2D)   (None, None, None, 64)    0
block2_conv1 (Conv2D)        (None, None, None, 128)   73856
block2_conv2 (Conv2D)        (None, None, None, 128)   147584
block2_pool (MaxPooling2D)   (None, None, None, 128)   0
block3_conv1 (Conv2D)        (None, None, None, 256)   295168
block3_conv2 (Conv2D)        (None, None, None, 256)   590080
block3_conv3 (Conv2D)        (None, None, None, 256)   590080
block3_pool (MaxPooling2D)   (None, None, None, 256)   0
block4_conv1 (Conv2D)        (None, None, None, 512)   1180160
block4_conv2 (Conv2D)        (None, None, None, 512)   2359808
block4_conv3 (Conv2D)        (None, None, None, 512)   2359808
block4_pool (MaxPooling2D)   (None, None, None, 512)   0
block5_conv1 (Conv2D)        (None, None, None, 512)   2359808
block5_conv2 (Conv2D)        (None, None, None, 512)   2359808
block5_conv3 (Conv2D)        (None, None, None, 512)   2359808
block5_pool (MaxPooling2D)   (None, None, None, 512)   0
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________
y_vgg = df['lbl']
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(vgg16_feature_list_np, y_vgg, test_size=0.25, random_state=42)
from sklearn.ensemble import RandomForestClassifier
RF_model_vgg = RandomForestClassifier(n_estimators = 100, random_state = 42)
RF_model_vgg.fit(X_train, y_train)
RandomForestClassifier(random_state=42)
predictions_rf_vgg = RF_model_vgg.predict(X_test)
print ("Accuracy = ", metrics.accuracy_score(y_test, predictions_rf_vgg))
Accuracy = 0.7320892600809082
Wow! With VGG-16 + random forest with 100 estimators we got 73 percent.
cm = confusion_matrix(y_test, predictions_rf_vgg)
fig, ax = plt.subplots(figsize=(40,40)) # Sample figsize in inches
sns.set(font_scale=1.6)
sns.heatmap(cm, annot=True, ax=ax)
plt.show()
Looks much better; with each run we are getting better and better results, and the diagonal looks much more defined.
Let's also see how we are doing per category
cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]  # row-normalize the confusion matrix
cm.diagonal()
i = 0
for result in cm.diagonal():
    print(str(i) + '. ' + 'Accuracy for: ' + str(labels[i]) + '=' + str(result))
    i += 1
0. Accuracy for: affenpinscher=0.8484848484848485
1. Accuracy for: afghan_hound=0.84375
2. Accuracy for: african_hunting_dog=0.9310344827586207
3. Accuracy for: airedale=0.7727272727272727
4. Accuracy for: american_staffordshire_terrier=0.5076923076923077
5. Accuracy for: appenzeller=0.6825396825396826
6. Accuracy for: australian_terrier=0.8072289156626506
7. Accuracy for: basenji=0.75
8. Accuracy for: basset=0.6727272727272727
9. Accuracy for: beagle=0.7692307692307693
10. Accuracy for: bedlington_terrier=0.9076923076923077
11. Accuracy for: bernese_mountain_dog=0.8541666666666666
12. Accuracy for: black-and-tan_coonhound=0.8285714285714286
13. Accuracy for: blenheim_spaniel=0.9027777777777778
14. Accuracy for: bloodhound=0.8133333333333334
15. Accuracy for: bluetick=0.7368421052631579
16. Accuracy for: border_collie=0.6909090909090909
17. Accuracy for: border_terrier=0.6911764705882353
18. Accuracy for: borzoi=0.6037735849056604
19. Accuracy for: boston_bull=0.9016393442622951
20. Accuracy for: bouvier_des_flandres=0.6774193548387096
21. Accuracy for: boxer=0.56
22. Accuracy for: brabancon_griffon=0.75
23. Accuracy for: briard=0.58
24. Accuracy for: brittany_spaniel=0.828125
25. Accuracy for: bull_mastiff=0.8032786885245902
26. Accuracy for: cairn=0.6301369863013698
27. Accuracy for: cardigan=0.6166666666666667
28. Accuracy for: chesapeake_bay_retriever=0.6792452830188679
29. Accuracy for: chihuahua=0.0
30. Accuracy for: chow=0.8571428571428571
31. Accuracy for: clumber=0.7966101694915254
32. Accuracy for: cocker_spaniel=0.6
33. Accuracy for: collie=0.6363636363636364
34. Accuracy for: curly-coated_retriever=0.8461538461538461
35. Accuracy for: dandie_dinmont=0.7763157894736842
36. Accuracy for: dhole=0.8913043478260869
37. Accuracy for: dingo=0.7192982456140351
38. Accuracy for: doberman=0.7727272727272727
39. Accuracy for: english_foxhound=0.7166666666666667
40. Accuracy for: english_setter=0.6363636363636364
41. Accuracy for: english_springer=0.7419354838709677
42. Accuracy for: entlebucher=0.8690476190476191
43. Accuracy for: eskimo_dog=0.48
44. Accuracy for: flat-coated_retriever=0.6379310344827587
45. Accuracy for: french_bulldog=0.631578947368421
46. Accuracy for: german_shepherd=0.5714285714285714
47. Accuracy for: german_short-haired_pointer=0.8032786885245902
48. Accuracy for: giant_schnauzer=0.6764705882352942
49. Accuracy for: golden_retriever=0.625
50. Accuracy for: gordon_setter=0.7428571428571429
51. Accuracy for: great_dane=0.6153846153846154
52. Accuracy for: great_pyrenees=0.8157894736842105
53. Accuracy for: greater_swiss_mountain_dog=0.6268656716417911
54. Accuracy for: groenendael=0.8591549295774648
55. Accuracy for: ibizan_hound=0.8545454545454545
56. Accuracy for: irish_setter=0.84375
57. Accuracy for: irish_terrier=0.7058823529411765
58. Accuracy for: irish_water_spaniel=0.7966101694915254
59. Accuracy for: irish_wolfhound=0.6296296296296297
60. Accuracy for: italian_greyhound=0.6376811594202898
61. Accuracy for: japanese_spaniel=0.7681159420289855
62. Accuracy for: keeshond=0.8666666666666667
63. Accuracy for: kelpie=0.5735294117647058
64. Accuracy for: kerry_blue_terrier=0.765625
65. Accuracy for: komondor=0.873015873015873
66. Accuracy for: kuvasz=0.5666666666666667
67. Accuracy for: labrador_retriever=0.5970149253731343
68. Accuracy for: lakeland_terrier=0.631578947368421
69. Accuracy for: leonberg=0.8488372093023255
70. Accuracy for: lhasa=0.6764705882352942
71. Accuracy for: malamute=0.625
72. Accuracy for: malinois=0.5957446808510638
73. Accuracy for: maltese_dog=0.9032258064516129
74. Accuracy for: mexican_hairless=0.8958333333333334
75. Accuracy for: miniature_pinscher=0.8333333333333334
76. Accuracy for: miniature_poodle=0.4576271186440678
77. Accuracy for: miniature_schnauzer=0.5833333333333334
78. Accuracy for: newfoundland=0.7088607594936709
79. Accuracy for: norfolk_terrier=0.6290322580645161
80. Accuracy for: norwegian_elkhound=0.8735632183908046
81. Accuracy for: norwich_terrier=0.5135135135135135
82. Accuracy for: old_english_sheepdog=0.7543859649122807
83. Accuracy for: otterhound=0.8679245283018868
84. Accuracy for: papillon=0.875
85. Accuracy for: pekinese=0.6612903225806451
86. Accuracy for: pembroke=0.7910447761194029
87. Accuracy for: pomeranian=0.8513513513513513
88. Accuracy for: pug=0.7681159420289855
89. Accuracy for: redbone=0.6037735849056604
90. Accuracy for: rhodesian_ridgeback=0.6911764705882353
91. Accuracy for: rottweiler=0.7678571428571429
92. Accuracy for: saint_bernard=0.8641975308641975
93. Accuracy for: saluki=0.6133333333333333
94. Accuracy for: samoyed=0.8837209302325582
95. Accuracy for: schipperke=0.8985507246376812
96. Accuracy for: scotch_terrier=0.85
97. Accuracy for: scottish_deerhound=0.8850574712643678
98. Accuracy for: sealyham_terrier=0.8225806451612904
99. Accuracy for: shetland_sheepdog=0.6444444444444445
100. Accuracy for: shih-tzu=0.6756756756756757
101. Accuracy for: siberian_husky=0.746031746031746
102. Accuracy for: silky_terrier=0.7468354430379747
103. Accuracy for: soft-coated_wheaten_terrier=0.6417910447761194
104. Accuracy for: staffordshire_bullterrier=0.7868852459016393
105. Accuracy for: standard_poodle=0.46875
106. Accuracy for: standard_schnauzer=0.4918032786885246
107. Accuracy for: sussex_spaniel=0.8305084745762712
108. Accuracy for: tibetan_mastiff=0.660377358490566
109. Accuracy for: tibetan_terrier=0.7391304347826086
110. Accuracy for: toy_poodle=0.6166666666666667
111. Accuracy for: toy_terrier=0.7230769230769231
112. Accuracy for: vizsla=0.873015873015873
113. Accuracy for: walker_hound=0.603448275862069
114. Accuracy for: weimaraner=0.7894736842105263
115. Accuracy for: welsh_springer_spaniel=0.6363636363636364
116. Accuracy for: west_highland_white_terrier=0.7540983606557377
117. Accuracy for: whippet=0.5733333333333334
118. Accuracy for: wire-haired_fox_terrier=0.6428571428571429
119. Accuracy for: yorkshire_terrier=0.647887323943662
We will also try Random Forests with 200 and 400 estimators.
RF_model_vgg_200 = RandomForestClassifier(n_estimators = 200, random_state = 42)
RF_model_vgg_200.fit(X_train, y_train)
RandomForestClassifier(n_estimators=200, random_state=42)
predictions_rf_vgg_200 = RF_model_vgg_200.predict(X_test)
print ("Accuracy = ", metrics.accuracy_score(y_test, predictions_rf_vgg_200))
Accuracy = 0.7568837269998695
cm = confusion_matrix(y_test, predictions_rf_vgg_200)
fig, ax = plt.subplots(figsize=(40,40)) # Sample figsize in inches
sns.set(font_scale=1.6)
sns.heatmap(cm, annot=True, ax=ax)
plt.show()
RF_model_vgg_400 = RandomForestClassifier(n_estimators = 400, random_state = 42)
RF_model_vgg_400.fit(X_train, y_train)
RandomForestClassifier(n_estimators=400, random_state=42)
predictions_rf_vgg_400 = RF_model_vgg_400.predict(X_test)
print ("Accuracy = ", metrics.accuracy_score(y_test, predictions_rf_vgg_400))
Accuracy = 0.7754143285919353
We can see that more estimators do better, but the returns diminish: doubling from 200 to 400 estimators only raised accuracy from 0.757 to 0.775.
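These diminishing returns are what ensemble theory predicts. A stdlib-only sketch (a Condorcet-style majority vote of independent weak classifiers — an idealization, not our actual Random Forest) shows the accuracy gains shrinking as the ensemble grows:

```python
from math import comb

def majority_vote_accuracy(n, p):
    """Probability that a majority of n independent classifiers,
    each correct with probability p, votes for the right class (n odd)."""
    # Sum P(exactly k correct) over all k forming a strict majority.
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

# Ensemble accuracy for 1, 51, 101 and 151 voters, each 60% accurate.
accs = [majority_vote_accuracy(n, 0.6) for n in (1, 51, 101, 151)]
gains = [b - a for a, b in zip(accs, accs[1:])]
# Accuracy keeps rising, but each additional block of voters adds less.
```

The same qualitative curve appears in our Random Forest results: real trees are correlated, so the gains flatten out even faster than in this idealized model.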
cm_400 = confusion_matrix(y_test, predictions_rf_vgg_400)
fig, ax = plt.subplots(figsize=(40,40)) # Sample figsize in inches
sns.set(font_scale=1.6)
sns.heatmap(cm_400, annot=True, ax=ax)
plt.show()
# Normalize the confusion matrix rows so the diagonal gives per-class recall
cm_400 = cm_400.astype('float') / cm_400.sum(axis=1)[:, np.newaxis]
for i, result in enumerate(cm_400.diagonal()):
    print(str(i) + '. Accuracy for: ' + str(labels[i]) + '=' + str(result))
0. Acuracy for: affenpinscher=0.9090909090909091 1. Acuracy for: afghan_hound=0.8854166666666666 2. Acuracy for: african_hunting_dog=0.9310344827586207 3. Acuracy for: airedale=0.7878787878787878 4. Acuracy for: american_staffordshire_terrier=0.5076923076923077 5. Acuracy for: appenzeller=0.6666666666666666 6. Acuracy for: australian_terrier=0.8313253012048193 7. Acuracy for: basenji=0.75 8. Acuracy for: basset=0.6909090909090909 9. Acuracy for: beagle=0.8153846153846154 10. Acuracy for: bedlington_terrier=0.9384615384615385 11. Acuracy for: bernese_mountain_dog=0.8854166666666666 12. Acuracy for: black-and-tan_coonhound=0.8571428571428571 13. Acuracy for: blenheim_spaniel=0.8888888888888888 14. Acuracy for: bloodhound=0.8666666666666667 15. Acuracy for: bluetick=0.7543859649122807 16. Acuracy for: border_collie=0.7454545454545455 17. Acuracy for: border_terrier=0.75 18. Acuracy for: borzoi=0.6037735849056604 19. Acuracy for: boston_bull=0.9344262295081968 20. Acuracy for: bouvier_des_flandres=0.7580645161290323 21. Acuracy for: boxer=0.56 22. Acuracy for: brabancon_griffon=0.875 23. Acuracy for: briard=0.7 24. Acuracy for: brittany_spaniel=0.84375 25. Acuracy for: bull_mastiff=0.8360655737704918 26. Acuracy for: cairn=0.6301369863013698 27. Acuracy for: cardigan=0.65 28. Acuracy for: chesapeake_bay_retriever=0.7924528301886793 29. Acuracy for: chihuahua=0.0 30. Acuracy for: chow=0.8714285714285714 31. Acuracy for: clumber=0.8305084745762712 32. Acuracy for: cocker_spaniel=0.6909090909090909 33. Acuracy for: collie=0.6060606060606061 34. Acuracy for: curly-coated_retriever=0.8717948717948718 35. Acuracy for: dandie_dinmont=0.8157894736842105 36. Acuracy for: dhole=0.9130434782608695 37. Acuracy for: dingo=0.7368421052631579 38. Acuracy for: doberman=0.7272727272727273 39. Acuracy for: english_foxhound=0.7333333333333333 40. Acuracy for: english_setter=0.7090909090909091 41. Acuracy for: english_springer=0.8225806451612904 42. 
Acuracy for: entlebucher=0.9047619047619048 43. Acuracy for: eskimo_dog=0.46 44. Acuracy for: flat-coated_retriever=0.7586206896551724 45. Acuracy for: french_bulldog=0.6666666666666666 46. Acuracy for: german_shepherd=0.6904761904761905 47. Acuracy for: german_short-haired_pointer=0.8688524590163934 48. Acuracy for: giant_schnauzer=0.7205882352941176 49. Acuracy for: golden_retriever=0.6875 50. Acuracy for: gordon_setter=0.8714285714285714 51. Acuracy for: great_dane=0.5769230769230769 52. Acuracy for: great_pyrenees=0.9210526315789473 53. Acuracy for: greater_swiss_mountain_dog=0.6865671641791045 54. Acuracy for: groenendael=0.8309859154929577 55. Acuracy for: ibizan_hound=0.9272727272727272 56. Acuracy for: irish_setter=0.875 57. Acuracy for: irish_terrier=0.7843137254901961 58. Acuracy for: irish_water_spaniel=0.864406779661017 59. Acuracy for: irish_wolfhound=0.6296296296296297 60. Acuracy for: italian_greyhound=0.6956521739130435 61. Acuracy for: japanese_spaniel=0.8695652173913043 62. Acuracy for: keeshond=0.9 63. Acuracy for: kelpie=0.5882352941176471 64. Acuracy for: kerry_blue_terrier=0.84375 65. Acuracy for: komondor=0.9365079365079365 66. Acuracy for: kuvasz=0.5833333333333334 67. Acuracy for: labrador_retriever=0.6865671641791045 68. Acuracy for: lakeland_terrier=0.6842105263157895 69. Acuracy for: leonberg=0.9069767441860465 70. Acuracy for: lhasa=0.7058823529411765 71. Acuracy for: malamute=0.671875 72. Acuracy for: malinois=0.723404255319149 73. Acuracy for: maltese_dog=0.9354838709677419 74. Acuracy for: mexican_hairless=0.875 75. Acuracy for: miniature_pinscher=0.8717948717948718 76. Acuracy for: miniature_poodle=0.4915254237288136 77. Acuracy for: miniature_schnauzer=0.6333333333333333 78. Acuracy for: newfoundland=0.7468354430379747 79. Acuracy for: norfolk_terrier=0.7096774193548387 80. Acuracy for: norwegian_elkhound=0.8620689655172413 81. Acuracy for: norwich_terrier=0.6216216216216216 82. 
Acuracy for: old_english_sheepdog=0.7894736842105263 83. Acuracy for: otterhound=0.9056603773584906 84. Acuracy for: papillon=0.875 85. Acuracy for: pekinese=0.6935483870967742 86. Acuracy for: pembroke=0.8507462686567164 87. Acuracy for: pomeranian=0.8783783783783784 88. Acuracy for: pug=0.8695652173913043 89. Acuracy for: redbone=0.6226415094339622 90. Acuracy for: rhodesian_ridgeback=0.7352941176470589 91. Acuracy for: rottweiler=0.875 92. Acuracy for: saint_bernard=0.8888888888888888 93. Acuracy for: saluki=0.7066666666666667 94. Acuracy for: samoyed=0.9069767441860465 95. Acuracy for: schipperke=0.9420289855072463 96. Acuracy for: scotch_terrier=0.9333333333333333 97. Acuracy for: scottish_deerhound=0.9425287356321839 98. Acuracy for: sealyham_terrier=0.8387096774193549 99. Acuracy for: shetland_sheepdog=0.6444444444444445 100. Acuracy for: shih-tzu=0.7297297297297297 101. Acuracy for: siberian_husky=0.8095238095238095 102. Acuracy for: silky_terrier=0.810126582278481 103. Acuracy for: soft-coated_wheaten_terrier=0.6567164179104478 104. Acuracy for: staffordshire_bullterrier=0.8032786885245902 105. Acuracy for: standard_poodle=0.5 106. Acuracy for: standard_schnauzer=0.5081967213114754 107. Acuracy for: sussex_spaniel=0.8305084745762712 108. Acuracy for: tibetan_mastiff=0.7735849056603774 109. Acuracy for: tibetan_terrier=0.782608695652174 110. Acuracy for: toy_poodle=0.5833333333333334 111. Acuracy for: toy_terrier=0.8 112. Acuracy for: vizsla=0.8888888888888888 113. Acuracy for: walker_hound=0.6724137931034483 114. Acuracy for: weimaraner=0.8596491228070176 115. Acuracy for: welsh_springer_spaniel=0.7954545454545454 116. Acuracy for: west_highland_white_terrier=0.8524590163934426 117. Acuracy for: whippet=0.6 118. Acuracy for: wire-haired_fox_terrier=0.75 119. Acuracy for: yorkshire_terrier=0.704225352112676
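Row-normalizing the confusion matrix is what turns raw counts into the per-class scores above: each diagonal entry becomes the fraction of that class's samples that were labelled correctly (i.e. its recall). A toy example with hypothetical counts:

```python
# Hypothetical 3-class confusion matrix: rows = true class, cols = predicted.
cm = [
    [8, 2, 0],   # class 0: 8 of 10 samples correct
    [1, 6, 3],   # class 1: 6 of 10 samples correct
    [0, 0, 10],  # class 2: all 10 samples correct
]
# Divide each row by its total, then read off the diagonal.
per_class_recall = [row[i] / sum(row) for i, row in enumerate(cm)]
# per_class_recall == [0.8, 0.6, 1.0]
```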
Looking at the heatmap, we see that eskimo_dog was classified as malamute 26% of the time.
We suspected this would happen, because the two breeds really do look very similar, as the sample images below show.
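We spotted the eskimo_dog/malamute mix-up by eye, but large off-diagonal entries can also be surfaced programmatically. A small helper, run here on toy data rather than our actual 120×120 matrix, that ranks the worst confusions:

```python
def most_confused(cm, names, top=3):
    """Rank (true, predicted) pairs by the fraction of the true class's
    samples that were mislabelled as the predicted class."""
    pairs = []
    for i, row in enumerate(cm):
        total = sum(row)
        for j, count in enumerate(row):
            if i != j and count:
                pairs.append((count / total, names[i], names[j]))
    return sorted(pairs, reverse=True)[:top]

# Toy 3-class matrix mimicking our eskimo_dog/malamute confusion.
cm = [[70, 26, 4],
      [10, 85, 5],
      [2, 3, 95]]
names = ["eskimo_dog", "malamute", "samoyed"]
# most_confused(cm, names, top=1) → [(0.26, 'eskimo_dog', 'malamute')]
```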
from sklearn.utils import shuffle
i = 1
f = plt.figure()
df_copy = shuffle(df)
for index, row in df_copy.loc[df_copy['breed'] == 'eskimo_dog'].head(2).iterrows():
    f.add_subplot(1, 2, i)
    i += 1
    plt.imshow(mpimg.imread(row['path_img']))
i = 1
f = plt.figure()
for index, row in df_copy.loc[df_copy['breed'] == 'malamute'].head(2).iterrows():
    f.add_subplot(1, 2, i)
    i += 1
    plt.imshow(mpimg.imread(row['path_img']))
import pickle
# Save the trained 400-estimator Random Forest for later reuse
with open('RF_model_vgg_400.sav', 'wb') as fh:
    pickle.dump(RF_model_vgg_400, fh)
from xgboost import XGBClassifier
XGB_classifier = XGBClassifier()
XGB_classifier.fit(X_train, y_train)
C:\Users\katunv\anaconda3\envs\dognewenv\lib\site-packages\xgboost\sklearn.py:1146: UserWarning: The use of label encoder in XGBClassifier is deprecated and will be removed in a future release. To remove this warning, do the following: 1) Pass option use_label_encoder=False when constructing XGBClassifier object; and 2) Encode your labels (y) as integers starting with 0, i.e. 0, 1, 2, ..., [num_class - 1]. warnings.warn(label_encoder_deprecation_msg, UserWarning)
[13:56:37] WARNING: C:/Users/Administrator/workspace/xgboost-win64_release_1.4.0/src/learner.cc:1095: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'multi:softprob' was changed from 'merror' to 'mlogloss'. Explicitly set eval_metric if you'd like to restore the old behavior.
XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, gamma=0, gpu_id=-1,
importance_type='gain', interaction_constraints='',
learning_rate=0.300000012, max_delta_step=0, max_depth=6,
min_child_weight=1, missing=nan, monotone_constraints='()',
n_estimators=100, n_jobs=4, num_parallel_tree=1,
objective='multi:softprob', random_state=0, reg_alpha=0,
reg_lambda=1, scale_pos_weight=None, subsample=1,
tree_method='exact', validate_parameters=1, verbosity=None)
predictions_XGB_vgg = XGB_classifier.predict(X_test)
print ("Accuracy = ", metrics.accuracy_score(y_test, predictions_XGB_vgg))
Accuracy = 0.7245204228109096
With XGBoost we got slightly less accurate results (0.725 vs. 0.775 for the 400-tree Random Forest), even though it trained for much longer.
We could increase the number of estimators, but since training time is already very long we decided to try something else instead.
cm = confusion_matrix(y_test, predictions_XGB_vgg)
fig, ax = plt.subplots(figsize=(40,40)) # Sample figsize in inches
sns.set(font_scale=1.6)
sns.heatmap(cm, annot=True, ax=ax)
plt.show()
# Save the trained XGBoost model as well
with open('XGB_classifier.sav', 'wb') as fh:
    pickle.dump(XGB_classifier, fh)
Now we will reorganize the data into one directory per breed, so that Keras's ImageDataGenerator can read it with flow_from_directory.
from sklearn.utils import shuffle
df = shuffle(df)
Create a directory for every breed:
for breed in df['breed'].unique():
    mkdirIfNotExist(os.path.join(TRAIN_PATH, breed))
    mkdirIfNotExist(os.path.join(VALID_PATH, breed))
Copy each image into its corresponding directory, holding out the first 20% of every breed for validation:
train_ratio = 0.8
for breed in df['breed'].unique():
    fnames = df[df['breed'] == breed]['path_img']
    idx = int(len(fnames) * (1 - train_ratio))
    val_fnames = fnames[:idx]
    train_fnames = fnames[idx:]
    copyIfNotExist(train_fnames, KAGGLE_PATH, os.path.join(TRAIN_PATH, breed))
    copyIfNotExist(val_fnames, KAGGLE_PATH, os.path.join(VALID_PATH, breed))
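One subtlety of this split is worth noting, shown here on a toy list of 10 hypothetical filenames: because `1 - 0.8` is not exactly representable in floating point, `int(len(fnames) * (1 - train_ratio))` truncates downward, so validation can get one file fewer per breed than the nominal 20%:

```python
fnames = [f"img_{k}.jpg" for k in range(10)]   # hypothetical filenames
train_ratio = 0.8
idx = int(len(fnames) * (1 - train_ratio))
# Caveat: 1 - 0.8 == 0.19999..., so 10 * 0.19999... truncates to 1, not 2;
# one image that "should" be validation ends up in training.
val_fnames, train_fnames = fnames[:idx], fnames[idx:]
safer_idx = round(len(fnames) * (1 - train_ratio))  # rounds to the expected 2
```

The effect is tiny at our dataset's scale, but `round()` (or slicing with an explicit integer count) avoids it entirely.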
We use data augmentation to artificially expand the training set by creating modified versions of the existing images.
from keras.preprocessing.image import ImageDataGenerator
img_width, img_height = 224, 224
batch_size = 16
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    shear_range=0.1,
    zoom_range=0.15
)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    TRAIN_PATH,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical')
total_train_image_count = train_generator.samples
class_count = train_generator.num_classes
validation_generator = test_datagen.flow_from_directory(
    VALID_PATH,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=False)
total_val_image_count = validation_generator.samples  # note: validation_generator, not train_generator
Found 24592 images belonging to 120 classes.
Found 6058 images belonging to 120 classes.
import matplotlib.pyplot as plt
train_first_dir = os.path.join(TRAIN_PATH, df['breed'][180])
fnames = [os.path.join(train_first_dir, fname) for fname in os.listdir(train_first_dir)]
img_path = fnames[4]
img = image.load_img(img_path, target_size=(img_width, img_height))
x = image.img_to_array(img)
x = x.reshape((1,) + x.shape)
i = 0
for batch in train_datagen.flow(x, batch_size=1):
    plt.figure(i)
    imgplot = plt.imshow(image.array_to_img(batch[0]))
    i += 1
    if i % 4 == 0:
        break
plt.show()
We can see that after augmentation the same dog image comes out rotated, zoomed, and sheared in different ways.
This lets us get more out of limited data: the model sees slightly different inputs each epoch, which helps avoid overfitting.
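The idea can be sketched without Keras: each epoch, apply a random transformation so the model never sees exactly the same array twice. A stdlib-only toy version below uses a random horizontal flip and a small shift on a nested-list "image" (ImageDataGenerator of course does far more, and operates on real pixel arrays):

```python
import random

def augment(img, rng):
    """Randomly flip the image left-right, then shift it right by 0-2
    pixels (zero-padding on the left) — a toy flip/shift augmentation."""
    rows = [row[::-1] for row in img] if rng.random() < 0.5 else [row[:] for row in img]
    shift = rng.randint(0, 2)
    return [[0] * shift + row[:len(row) - shift] for row in rows]

img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
rng = random.Random(0)
epochs = [augment(img, rng) for _ in range(4)]  # a fresh variant per epoch
```

Because the transformation is redrawn every epoch, the effective dataset the model trains on is much larger than the number of files on disk.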
We will first try our own CNN and see whether a simple CNN can accurately predict the dog breeds.
Here we define a neural network with 7 convolutional layers followed by one fully connected layer.
from keras import layers, models, regularizers, optimizers
from keras.models import Sequential, Model
from keras.layers import Dense, Conv2D, MaxPool2D , Flatten,Dropout
from keras.layers.normalization import BatchNormalization
model_1 = Sequential()
model_1.add(Conv2D(input_shape=(img_height, img_width, 3), filters=32, kernel_size=(3,3), padding="same", activation="relu"))
model_1.add(BatchNormalization())
model_1.add(Conv2D(filters=32, kernel_size=(3,3), padding="same", activation="relu"))
model_1.add(BatchNormalization())
model_1.add(Conv2D(filters=32, kernel_size=(3,3), padding="same", activation="relu"))
model_1.add(BatchNormalization())
model_1.add(Conv2D(filters=64, kernel_size=(3,3), padding="same", activation="relu"))
model_1.add(BatchNormalization())
model_1.add(Conv2D(filters=64, kernel_size=(3,3), padding="same", activation="relu"))
model_1.add(BatchNormalization())
model_1.add(Conv2D(filters=64, kernel_size=(3,3), padding="same", activation="relu"))
model_1.add(BatchNormalization())
model_1.add(Conv2D(filters=128, kernel_size=(3,3), padding="same", activation="relu"))
model_1.add(BatchNormalization())
model_1.add(MaxPool2D((2, 2)))
model_1.add(Flatten())
model_1.add(Dense(units=128,activation="relu"))
model_1.add(BatchNormalization())
model_1.add(Dense(class_count, activation="softmax"))
We are using the Adam optimizer.
from keras.optimizers import Adam
from keras import optimizers
opt = Adam(learning_rate=0.002)  # 'lr' is deprecated in favor of 'learning_rate'
model_1.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['acc'])
model_1.summary()
Model: "sequential" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= conv2d (Conv2D) (None, 224, 224, 32) 896 _________________________________________________________________ batch_normalization (BatchNo (None, 224, 224, 32) 128 _________________________________________________________________ conv2d_1 (Conv2D) (None, 224, 224, 32) 9248 _________________________________________________________________ batch_normalization_1 (Batch (None, 224, 224, 32) 128 _________________________________________________________________ conv2d_2 (Conv2D) (None, 224, 224, 32) 9248 _________________________________________________________________ batch_normalization_2 (Batch (None, 224, 224, 32) 128 _________________________________________________________________ conv2d_3 (Conv2D) (None, 224, 224, 64) 18496 _________________________________________________________________ batch_normalization_3 (Batch (None, 224, 224, 64) 256 _________________________________________________________________ conv2d_4 (Conv2D) (None, 224, 224, 64) 36928 _________________________________________________________________ batch_normalization_4 (Batch (None, 224, 224, 64) 256 _________________________________________________________________ conv2d_5 (Conv2D) (None, 224, 224, 64) 36928 _________________________________________________________________ batch_normalization_5 (Batch (None, 224, 224, 64) 256 _________________________________________________________________ conv2d_6 (Conv2D) (None, 224, 224, 128) 73856 _________________________________________________________________ batch_normalization_6 (Batch (None, 224, 224, 128) 512 _________________________________________________________________ max_pooling2d (MaxPooling2D) (None, 112, 112, 128) 0 _________________________________________________________________ flatten (Flatten) (None, 1605632) 0 
_________________________________________________________________ dense (Dense) (None, 128) 205521024 _________________________________________________________________ batch_normalization_7 (Batch (None, 128) 512 _________________________________________________________________ dense_1 (Dense) (None, 120) 15480 ================================================================= Total params: 205,724,280 Trainable params: 205,723,192 Non-trainable params: 1,088 _________________________________________________________________
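The summary shows why this network is so heavy: almost all of its ~205M parameters sit in the single Dense layer right after Flatten. The arithmetic reproduces the summary's figure:

```python
# Flatten output: 112 x 112 spatial positions x 128 channels
flat = 112 * 112 * 128            # 1,605,632 features
dense_params = flat * 128 + 128   # weights + biases of the 128-unit Dense layer
# dense_params == 205_521_024, matching the 'dense' row of the summary.
# Replacing Flatten with GlobalAveragePooling2D would shrink this layer
# to just 128 * 128 + 128 parameters.
```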
from time import strftime
import math
steps_per_epoch = math.ceil(train_generator.samples / batch_size)
validation_steps = math.ceil(validation_generator.samples / batch_size)
history = model_1.fit(  # Model.fit supports generators; fit_generator is deprecated
    train_generator,
    steps_per_epoch=steps_per_epoch,
    epochs=50,
    validation_data=validation_generator,
    validation_steps=validation_steps
)
Epoch 1/50 1537/1537 [==============================] - 480s 311ms/step - loss: 4.5783 - acc: 0.0314 - val_loss: 4.3067 - val_acc: 0.0589 Epoch 2/50 1537/1537 [==============================] - 477s 310ms/step - loss: 4.3881 - acc: 0.0479 - val_loss: 4.2238 - val_acc: 0.0632 Epoch 3/50 1537/1537 [==============================] - 475s 309ms/step - loss: 4.2766 - acc: 0.0586 - val_loss: 4.2613 - val_acc: 0.0621 Epoch 4/50 1537/1537 [==============================] - 475s 309ms/step - loss: 4.1518 - acc: 0.0732 - val_loss: 4.1453 - val_acc: 0.0759 Epoch 5/50 1537/1537 [==============================] - 474s 308ms/step - loss: 4.0092 - acc: 0.0917 - val_loss: 4.0407 - val_acc: 0.0948 Epoch 6/50 1537/1537 [==============================] - 474s 308ms/step - loss: 3.8972 - acc: 0.1077 - val_loss: 7.0275 - val_acc: 0.0461 Epoch 7/50 1537/1537 [==============================] - 474s 308ms/step - loss: 3.7879 - acc: 0.1258 - val_loss: 6.3568 - val_acc: 0.0703 Epoch 8/50 1537/1537 [==============================] - 473s 308ms/step - loss: 3.6988 - acc: 0.1361 - val_loss: 4.8475 - val_acc: 0.0399 Epoch 9/50 1537/1537 [==============================] - 473s 308ms/step - loss: 3.6149 - acc: 0.1496 - val_loss: 3.6105 - val_acc: 0.1525 Epoch 10/50 1537/1537 [==============================] - 473s 308ms/step - loss: 3.5291 - acc: 0.1629 - val_loss: 3.7227 - val_acc: 0.1593 Epoch 11/50 1537/1537 [==============================] - 474s 308ms/step - loss: 3.4548 - acc: 0.1780 - val_loss: 3.3734 - val_acc: 0.1925 Epoch 12/50 1537/1537 [==============================] - 473s 308ms/step - loss: 3.3908 - acc: 0.1862 - val_loss: 3.9160 - val_acc: 0.1349 Epoch 13/50 1537/1537 [==============================] - 473s 308ms/step - loss: 3.3228 - acc: 0.2002 - val_loss: 3.5024 - val_acc: 0.1789 Epoch 14/50 1537/1537 [==============================] - 473s 308ms/step - loss: 3.2645 - acc: 0.2112 - val_loss: 3.6350 - val_acc: 0.1750 Epoch 15/50 1537/1537 [==============================] - 474s 
309ms/step - loss: 3.2048 - acc: 0.2210 - val_loss: 3.5184 - val_acc: 0.2057 Epoch 16/50 1537/1537 [==============================] - 473s 308ms/step - loss: 3.1654 - acc: 0.2274 - val_loss: 3.2544 - val_acc: 0.2268 Epoch 17/50 1537/1537 [==============================] - 473s 308ms/step - loss: 3.1189 - acc: 0.2354 - val_loss: 3.5270 - val_acc: 0.2078 Epoch 18/50 1537/1537 [==============================] - 474s 308ms/step - loss: 3.0806 - acc: 0.2407 - val_loss: 3.4905 - val_acc: 0.1971 Epoch 19/50 1537/1537 [==============================] - 474s 308ms/step - loss: 3.1676 - acc: 0.2294 - val_loss: 3.1234 - val_acc: 0.2453 Epoch 20/50 1537/1537 [==============================] - 473s 308ms/step - loss: 3.0031 - acc: 0.2565 - val_loss: 3.1355 - val_acc: 0.2466 Epoch 21/50 1537/1537 [==============================] - 473s 308ms/step - loss: 2.9622 - acc: 0.2631 - val_loss: 3.3219 - val_acc: 0.2129 Epoch 22/50 1537/1537 [==============================] - 473s 307ms/step - loss: 2.9224 - acc: 0.2727 - val_loss: 3.0993 - val_acc: 0.2585 Epoch 23/50 1537/1537 [==============================] - 473s 307ms/step - loss: 2.9089 - acc: 0.2747 - val_loss: 2.9652 - val_acc: 0.2686 Epoch 24/50 1537/1537 [==============================] - 472s 307ms/step - loss: 2.8675 - acc: 0.2854 - val_loss: 2.9588 - val_acc: 0.2737 Epoch 25/50 1537/1537 [==============================] - 473s 307ms/step - loss: 2.8383 - acc: 0.2891 - val_loss: 3.1875 - val_acc: 0.2394 Epoch 26/50 1537/1537 [==============================] - 472s 307ms/step - loss: 2.8202 - acc: 0.2931 - val_loss: 3.0275 - val_acc: 0.2643 Epoch 27/50 1537/1537 [==============================] - 473s 307ms/step - loss: 2.7670 - acc: 0.3006 - val_loss: 3.0191 - val_acc: 0.2710 Epoch 28/50 1537/1537 [==============================] - 473s 308ms/step - loss: 2.7453 - acc: 0.3079 - val_loss: 3.1084 - val_acc: 0.2663 Epoch 29/50 1537/1537 [==============================] - 472s 307ms/step - loss: 2.7280 - acc: 0.3110 - val_loss: 
2.9379 - val_acc: 0.2803 Epoch 30/50 1537/1537 [==============================] - 472s 307ms/step - loss: 2.6967 - acc: 0.3158 - val_loss: 3.2037 - val_acc: 0.2633 Epoch 31/50 1537/1537 [==============================] - 473s 307ms/step - loss: 2.6813 - acc: 0.3211 - val_loss: 2.9028 - val_acc: 0.2940 Epoch 32/50 1537/1537 [==============================] - 472s 307ms/step - loss: 2.6577 - acc: 0.3230 - val_loss: 2.9995 - val_acc: 0.2843 Epoch 33/50 1537/1537 [==============================] - 474s 309ms/step - loss: 2.6266 - acc: 0.3289 - val_loss: 2.9567 - val_acc: 0.2948 Epoch 34/50 1537/1537 [==============================] - 474s 308ms/step - loss: 2.6075 - acc: 0.3346 - val_loss: 3.3343 - val_acc: 0.2521 Epoch 35/50 1537/1537 [==============================] - 472s 307ms/step - loss: 2.5672 - acc: 0.3419 - val_loss: 2.8138 - val_acc: 0.3082 Epoch 36/50 1537/1537 [==============================] - 472s 307ms/step - loss: 2.5743 - acc: 0.3423 - val_loss: 3.2682 - val_acc: 0.2448 Epoch 37/50 1537/1537 [==============================] - 473s 307ms/step - loss: 2.5302 - acc: 0.3494 - val_loss: 2.8972 - val_acc: 0.2998 Epoch 38/50 1537/1537 [==============================] - 473s 307ms/step - loss: 2.5045 - acc: 0.3590 - val_loss: 2.8564 - val_acc: 0.3074 Epoch 39/50 1537/1537 [==============================] - 473s 307ms/step - loss: 2.4940 - acc: 0.3582 - val_loss: 3.2611 - val_acc: 0.2671 Epoch 40/50 1537/1537 [==============================] - 473s 307ms/step - loss: 2.4718 - acc: 0.3636 - val_loss: 2.8478 - val_acc: 0.3052 Epoch 41/50 1537/1537 [==============================] - 472s 307ms/step - loss: 2.4571 - acc: 0.3619 - val_loss: 3.0054 - val_acc: 0.2780 Epoch 42/50 1537/1537 [==============================] - 472s 307ms/step - loss: 2.4350 - acc: 0.3709 - val_loss: 3.2834 - val_acc: 0.2631 Epoch 43/50 1537/1537 [==============================] - 472s 307ms/step - loss: 2.4160 - acc: 0.3736 - val_loss: 2.9138 - val_acc: 0.2994 Epoch 44/50 1537/1537 
[==============================] - 472s 307ms/step - loss: 2.3853 - acc: 0.3792 - val_loss: 2.7592 - val_acc: 0.3293 Epoch 45/50 1537/1537 [==============================] - 473s 308ms/step - loss: 2.3834 - acc: 0.3794 - val_loss: 2.9918 - val_acc: 0.2811 Epoch 46/50 1537/1537 [==============================] - 473s 307ms/step - loss: 2.3635 - acc: 0.3844 - val_loss: 2.7384 - val_acc: 0.3230 Epoch 47/50 1537/1537 [==============================] - 475s 309ms/step - loss: 2.3412 - acc: 0.3889 - val_loss: 2.6788 - val_acc: 0.3424 Epoch 48/50 1537/1537 [==============================] - 473s 307ms/step - loss: 2.3228 - acc: 0.3942 - val_loss: 5.8100 - val_acc: 0.1636 Epoch 49/50 1537/1537 [==============================] - 472s 307ms/step - loss: 2.3193 - acc: 0.3935 - val_loss: 2.6769 - val_acc: 0.3455 Epoch 50/50 1537/1537 [==============================] - 473s 308ms/step - loss: 2.2776 - acc: 0.4037 - val_loss: 2.8006 - val_acc: 0.3244
We trained for 50 epochs (the model saw the full training set 50 times), and the best validation accuracy tops out at about 35%, which is worse than our other models.
import matplotlib.pyplot as plt
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(len(acc))
plt.plot(epochs, acc, 'bo')
plt.plot(epochs, val_acc, 'b')
plt.title('Training and validation accuracy')
plt.figure()
plt.plot(epochs, loss, 'bo')
plt.plot(epochs, val_loss, 'b')
plt.title('Training and validation loss')
plt.show()
We can see that we are overfitting: the training accuracy curve (dots) keeps climbing while diverging from the validation curve (solid line).
Y_pred = model_1.predict(validation_generator)  # predict_generator is deprecated; one full pass over the validation set
y_pred = np.argmax(Y_pred, axis=1)
cm_cnn = confusion_matrix(validation_generator.classes, y_pred)
print('Classification Report')
print(classification_report(validation_generator.classes, y_pred, target_names=labels.values()))
Classification Report
precision recall f1-score support
affenpinscher 0.39 0.16 0.22 45
afghan_hound 0.28 0.61 0.38 70
african_hunting_dog 0.74 0.40 0.52 50
airedale 0.38 0.38 0.38 61
american_staffordshire_terrier 0.25 0.17 0.20 47
appenzeller 0.47 0.36 0.41 45
australian_terrier 0.60 0.20 0.30 59
basenji 0.17 0.59 0.27 63
basset 0.15 0.08 0.10 51
beagle 0.26 0.19 0.22 59
bedlington_terrier 0.43 0.46 0.45 54
bernese_mountain_dog 0.65 0.55 0.60 66
black-and-tan_coonhound 0.56 0.38 0.46 47
blenheim_spaniel 0.74 0.60 0.66 57
bloodhound 0.24 0.26 0.25 54
bluetick 0.56 0.18 0.27 51
border_collie 0.44 0.50 0.47 44
border_terrier 0.28 0.17 0.21 52
borzoi 0.33 0.22 0.27 45
boston_bull 0.57 0.40 0.47 53
bouvier_des_flandres 0.52 0.30 0.38 47
boxer 0.18 0.04 0.07 45
brabancon_griffon 0.67 0.19 0.29 43
briard 0.27 0.26 0.26 43
brittany_spaniel 0.55 0.14 0.22 44
bull_mastiff 0.30 0.30 0.30 46
cairn 0.23 0.17 0.19 60
cardigan 0.23 0.22 0.22 46
chesapeake_bay_retriever 0.15 0.55 0.23 49
chihuahua 0.00 0.00 0.00 14
chow 0.35 0.35 0.35 57
clumber 0.30 0.56 0.39 45
cocker_spaniel 0.19 0.22 0.20 46
collie 0.22 0.09 0.12 47
curly-coated_retriever 0.34 0.34 0.34 44
dandie_dinmont 0.54 0.40 0.46 53
dhole 0.50 0.51 0.51 45
dingo 0.31 0.36 0.34 47
doberman 0.16 0.50 0.24 44
english_foxhound 0.62 0.33 0.43 48
english_setter 0.43 0.27 0.33 48
english_springer 0.37 0.39 0.38 46
entlebucher 0.75 0.57 0.65 63
eskimo_dog 0.16 0.16 0.16 43
flat-coated_retriever 0.22 0.32 0.26 44
french_bulldog 0.17 0.16 0.16 45
german_shepherd 0.42 0.11 0.18 44
german_short-haired_pointer 0.33 0.33 0.33 45
giant_schnauzer 0.15 0.27 0.19 45
golden_retriever 0.58 0.16 0.25 43
gordon_setter 0.69 0.20 0.31 46
great_dane 0.26 0.13 0.17 46
great_pyrenees 0.19 0.41 0.26 64
greater_swiss_mountain_dog 0.46 0.39 0.42 49
groenendael 0.57 0.35 0.43 46
ibizan_hound 0.26 0.51 0.34 55
irish_setter 0.73 0.40 0.51 48
irish_terrier 0.40 0.16 0.23 50
irish_water_spaniel 0.78 0.47 0.58 45
irish_wolfhound 0.15 0.17 0.16 63
italian_greyhound 0.19 0.35 0.25 54
japanese_spaniel 0.62 0.40 0.49 57
keeshond 0.79 0.55 0.65 47
kelpie 0.12 0.09 0.10 47
kerry_blue_terrier 0.79 0.29 0.42 52
komondor 0.29 0.52 0.37 44
kuvasz 0.31 0.45 0.37 44
labrador_retriever 0.20 0.30 0.24 50
lakeland_terrier 0.20 0.29 0.23 59
leonberg 0.67 0.51 0.58 63
lhasa 0.17 0.15 0.16 55
malamute 0.44 0.33 0.38 51
malinois 0.28 0.25 0.27 44
maltese_dog 0.36 0.45 0.40 73
mexican_hairless 0.55 0.52 0.53 46
miniature_pinscher 0.46 0.21 0.29 57
miniature_poodle 0.20 0.20 0.20 46
miniature_schnauzer 0.38 0.17 0.24 46
newfoundland 0.20 0.26 0.23 57
norfolk_terrier 0.31 0.24 0.27 50
norwegian_elkhound 0.38 0.71 0.49 58
norwich_terrier 0.39 0.13 0.20 52
old_english_sheepdog 0.43 0.18 0.25 51
otterhound 0.67 0.28 0.39 43
papillon 0.47 0.36 0.41 58
pekinese 0.32 0.23 0.27 44
pembroke 0.57 0.48 0.52 54
pomeranian 0.38 0.32 0.35 65
pug 0.27 0.12 0.17 58
redbone 0.27 0.33 0.30 43
rhodesian_ridgeback 0.29 0.18 0.22 51
rottweiler 0.42 0.18 0.25 45
saint_bernard 0.64 0.54 0.59 50
saluki 0.27 0.32 0.29 59
samoyed 0.35 0.65 0.45 65
schipperke 0.46 0.38 0.42 47
scotch_terrier 0.33 0.17 0.23 47
scottish_deerhound 0.37 0.35 0.36 71
sealyham_terrier 0.38 0.67 0.48 57
shetland_sheepdog 0.42 0.33 0.37 46
shih-tzu 0.30 0.28 0.29 65
siberian_husky 0.12 0.12 0.12 57
silky_terrier 0.56 0.28 0.37 54
soft-coated_wheaten_terrier 0.29 0.49 0.37 45
staffordshire_bullterrier 0.25 0.20 0.22 46
standard_poodle 0.08 0.02 0.03 47
standard_schnauzer 0.28 0.22 0.25 45
sussex_spaniel 0.42 0.69 0.53 45
tibetan_mastiff 0.53 0.23 0.32 44
tibetan_terrier 0.27 0.31 0.29 62
toy_poodle 0.15 0.09 0.11 46
toy_terrier 0.38 0.16 0.23 50
vizsla 0.22 0.57 0.32 44
walker_hound 0.38 0.23 0.29 44
weimaraner 0.16 0.65 0.26 48
welsh_springer_spaniel 0.44 0.38 0.40 45
west_highland_white_terrier 0.29 0.43 0.35 49
whippet 0.18 0.30 0.22 56
wire-haired_fox_terrier 0.47 0.32 0.38 47
yorkshire_terrier 0.25 0.08 0.12 49
accuracy 0.32 6058
macro avg 0.37 0.32 0.32 6058
weighted avg 0.38 0.32 0.32 6058
fig, ax = plt.subplots(figsize=(40,40)) # Sample figsize in inches
sns.set(font_scale=1.6)
sns.heatmap(cm_cnn, annot=True, ax=ax)
plt.show()
We now use a pretrained Xception network and fine-tune it on our own data.
from keras.applications.xception import Xception
conv_base = Xception(weights='imagenet',
include_top=False,
input_shape=(img_width, img_height, 3))
conv_base.trainable = True
Defining the neural network: the first layers are those of the Xception network pretrained on ImageNet, followed by a fully connected layer with 512 nodes.
from keras import layers, models, regularizers, optimizers
from keras.models import Sequential, Model
from keras.layers import Flatten, Dense, Dropout
model_2 = models.Sequential()
model_2.add(conv_base)
model_2.add(layers.Flatten())
model_2.add(layers.Dense(512, activation='relu'))
model_2.add(layers.Dense(class_count, activation='softmax'))  # softmax (not sigmoid) for single-label, multi-class output
model_2.compile(loss='categorical_crossentropy',
                optimizer=optimizers.SGD(learning_rate=1e-4, momentum=0.90),
                metrics=['acc'])
model_2.summary()
Model: "sequential_2" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= xception (Functional) (None, 7, 7, 2048) 20861480 _________________________________________________________________ flatten_1 (Flatten) (None, 100352) 0 _________________________________________________________________ dense_2 (Dense) (None, 512) 51380736 _________________________________________________________________ dense_3 (Dense) (None, 120) 61560 ================================================================= Total params: 72,303,776 Trainable params: 72,249,248 Non-trainable params: 54,528 _________________________________________________________________
from time import strftime
from keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau
earlyStopping = EarlyStopping(monitor='val_loss', patience=10, verbose=0, mode='min')
mcp_save = ModelCheckpoint('.mdl_wts.hdf5', save_best_only=True, monitor='val_loss', mode='min')
reduce_lr_loss = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=7, verbose=1, min_delta=1e-4, mode='min')
steps_per_epoch = train_generator.samples // 16
validation_steps = validation_generator.samples // 16
history_2 = model_2.fit(
    train_generator,
    steps_per_epoch=steps_per_epoch,
    epochs=50,
    callbacks=[earlyStopping, mcp_save, reduce_lr_loss],
    validation_data=validation_generator,
    validation_steps=validation_steps
)
Epoch 1/50 1537/1537 [==============================] - 470s 306ms/step - loss: 1.0279 - acc: 0.7098 - val_loss: 0.6413 - val_acc: 0.8079
Epoch 2/50 1537/1537 [==============================] - 472s 307ms/step - loss: 0.8877 - acc: 0.7394 - val_loss: 0.5820 - val_acc: 0.8206 Epoch 3/50 1537/1537 [==============================] - 471s 306ms/step - loss: 0.7905 - acc: 0.7679 - val_loss: 0.5447 - val_acc: 0.8331 Epoch 4/50 1537/1537 [==============================] - 526s 342ms/step - loss: 0.7315 - acc: 0.7787 - val_loss: 0.5222 - val_acc: 0.8453 Epoch 5/50 1537/1537 [==============================] - 576s 375ms/step - loss: 0.6695 - acc: 0.7970 - val_loss: 0.5120 - val_acc: 0.8415 Epoch 6/50 1537/1537 [==============================] - 553s 360ms/step - loss: 0.6241 - acc: 0.8097 - val_loss: 0.4876 - val_acc: 0.8526 Epoch 7/50 1537/1537 [==============================] - 580s 378ms/step - loss: 0.5721 - acc: 0.8223 - val_loss: 0.4794 - val_acc: 0.8521 Epoch 8/50 1537/1537 [==============================] - 563s 366ms/step - loss: 0.5514 - acc: 0.8289 - val_loss: 0.4662 - val_acc: 0.8585 Epoch 9/50 1537/1537 [==============================] - 581s 378ms/step - loss: 0.5128 - acc: 0.8392 - val_loss: 0.4447 - val_acc: 0.8666 Epoch 10/50 1537/1537 [==============================] - 572s 372ms/step - loss: 0.4790 - acc: 0.8463 - val_loss: 0.4446 - val_acc: 0.8681 Epoch 11/50 1537/1537 [==============================] - 518s 337ms/step - loss: 0.4481 - acc: 0.8589 - val_loss: 0.4276 - val_acc: 0.8764 Epoch 12/50 1537/1537 [==============================] - 478s 311ms/step - loss: 0.4200 - acc: 0.8686 - val_loss: 0.4176 - val_acc: 0.8726 Epoch 13/50 1537/1537 [==============================] - 472s 307ms/step - loss: 0.4041 - acc: 0.8726 - val_loss: 0.4206 - val_acc: 0.8777 Epoch 14/50 1537/1537 [==============================] - 476s 309ms/step - loss: 0.3848 - acc: 0.8761 - val_loss: 0.4130 - val_acc: 0.8810 Epoch 15/50 1537/1537 [==============================] - 474s 308ms/step - loss: 0.3590 - acc: 0.8866 - val_loss: 0.4047 - val_acc: 0.8802 Epoch 16/50 1537/1537 [==============================] - 474s 
308ms/step - loss: 0.3449 - acc: 0.8916 - val_loss: 0.4016 - val_acc: 0.8820 Epoch 17/50 1537/1537 [==============================] - 473s 308ms/step - loss: 0.3317 - acc: 0.8937 - val_loss: 0.3948 - val_acc: 0.8843 Epoch 18/50 1537/1537 [==============================] - 474s 308ms/step - loss: 0.3190 - acc: 0.8977 - val_loss: 0.3864 - val_acc: 0.8887 Epoch 19/50 1537/1537 [==============================] - 472s 307ms/step - loss: 0.2970 - acc: 0.9068 - val_loss: 0.3815 - val_acc: 0.8899 Epoch 20/50 1537/1537 [==============================] - 474s 308ms/step - loss: 0.2866 - acc: 0.9082 - val_loss: 0.3985 - val_acc: 0.8866 Epoch 21/50 1537/1537 [==============================] - 472s 307ms/step - loss: 0.2709 - acc: 0.9135 - val_loss: 0.3823 - val_acc: 0.8882 Epoch 22/50 1537/1537 [==============================] - 472s 307ms/step - loss: 0.2628 - acc: 0.9168 - val_loss: 0.3829 - val_acc: 0.8930 Epoch 23/50 1537/1537 [==============================] - 471s 306ms/step - loss: 0.2506 - acc: 0.9210 - val_loss: 0.3826 - val_acc: 0.8935 Epoch 24/50 1537/1537 [==============================] - 472s 307ms/step - loss: 0.2449 - acc: 0.9224 - val_loss: 0.3836 - val_acc: 0.8955 Epoch 25/50 1537/1537 [==============================] - 471s 306ms/step - loss: 0.2330 - acc: 0.9232 - val_loss: 0.3820 - val_acc: 0.8972 Epoch 26/50 1537/1537 [==============================] - 473s 307ms/step - loss: 0.2228 - acc: 0.9284 - val_loss: 0.3792 - val_acc: 0.8962 Epoch 27/50 1537/1537 [==============================] - 473s 308ms/step - loss: 0.2163 - acc: 0.9315 - val_loss: 0.3649 - val_acc: 0.9003 Epoch 28/50 1537/1537 [==============================] - 474s 308ms/step - loss: 0.2036 - acc: 0.9347 - val_loss: 0.3854 - val_acc: 0.8939 Epoch 29/50 1537/1537 [==============================] - 474s 308ms/step - loss: 0.2047 - acc: 0.9338 - val_loss: 0.3713 - val_acc: 0.9000 Epoch 30/50 1537/1537 [==============================] - 473s 307ms/step - loss: 0.1898 - acc: 0.9405 - val_loss: 
0.3737 - val_acc: 0.8991 Epoch 31/50 1537/1537 [==============================] - 473s 308ms/step - loss: 0.1893 - acc: 0.9390 - val_loss: 0.3776 - val_acc: 0.9016 Epoch 32/50 1537/1537 [==============================] - 474s 308ms/step - loss: 0.1761 - acc: 0.9444 - val_loss: 0.3755 - val_acc: 0.9021 Epoch 33/50 1537/1537 [==============================] - 474s 308ms/step - loss: 0.1754 - acc: 0.9434 - val_loss: 0.3650 - val_acc: 0.9029 Epoch 34/50 1537/1537 [==============================] - 473s 307ms/step - loss: 0.1741 - acc: 0.9454 - val_loss: 0.3793 - val_acc: 0.9034 Epoch 00034: ReduceLROnPlateau reducing learning rate to 9.999999747378752e-06. Epoch 35/50 1537/1537 [==============================] - 473s 308ms/step - loss: 0.1554 - acc: 0.9518 - val_loss: 0.3649 - val_acc: 0.9062 Epoch 36/50 1537/1537 [==============================] - 472s 307ms/step - loss: 0.1465 - acc: 0.9551 - val_loss: 0.3664 - val_acc: 0.9082 Epoch 37/50 1537/1537 [==============================] - 473s 308ms/step - loss: 0.1478 - acc: 0.9534 - val_loss: 0.3643 - val_acc: 0.9064 Epoch 38/50 1537/1537 [==============================] - 476s 310ms/step - loss: 0.1461 - acc: 0.9544 - val_loss: 0.3627 - val_acc: 0.9064 Epoch 39/50 1537/1537 [==============================] - 473s 308ms/step - loss: 0.1464 - acc: 0.9539 - val_loss: 0.3682 - val_acc: 0.9059 Epoch 40/50 1537/1537 [==============================] - 475s 309ms/step - loss: 0.1466 - acc: 0.9544 - val_loss: 0.3657 - val_acc: 0.9077 Epoch 41/50 1537/1537 [==============================] - 473s 308ms/step - loss: 0.1407 - acc: 0.9553 - val_loss: 0.3650 - val_acc: 0.9082 Epoch 42/50 1537/1537 [==============================] - 479s 311ms/step - loss: 0.1400 - acc: 0.9570 - val_loss: 0.3645 - val_acc: 0.9074 Epoch 43/50 1537/1537 [==============================] - 490s 319ms/step - loss: 0.1347 - acc: 0.9587 - val_loss: 0.3657 - val_acc: 0.9074 Epoch 44/50 1537/1537 [==============================] - 484s 315ms/step - loss: 0.1358 
- acc: 0.9567 - val_loss: 0.3676 - val_acc: 0.9079 Epoch 45/50 1537/1537 [==============================] - 499s 324ms/step - loss: 0.1322 - acc: 0.9593 - val_loss: 0.3683 - val_acc: 0.9076 Epoch 00045: ReduceLROnPlateau reducing learning rate to 9.999999747378752e-07. Epoch 46/50 1537/1537 [==============================] - 493s 320ms/step - loss: 0.1368 - acc: 0.9578 - val_loss: 0.3673 - val_acc: 0.9084 Epoch 47/50 1537/1537 [==============================] - 488s 318ms/step - loss: 0.1399 - acc: 0.9559 - val_loss: 0.3672 - val_acc: 0.9077 Epoch 48/50 1537/1537 [==============================] - 479s 311ms/step - loss: 0.1390 - acc: 0.9553 - val_loss: 0.3673 - val_acc: 0.9079
import matplotlib.pyplot as plt
acc = history_2.history['acc']
val_acc = history_2.history['val_acc']
loss = history_2.history['loss']
val_loss = history_2.history['val_loss']
epochs = range(len(acc))
plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()
Y_pred = model_2.predict(validation_generator, steps=validation_steps)
y_pred = np.argmax(Y_pred, axis=1)
cm_xception = confusion_matrix(validation_generator.classes, y_pred)
print('Classification Report')
print(classification_report(validation_generator.classes, y_pred, target_names=list(labels.values())))
Classification Report
precision recall f1-score support
affenpinscher 0.92 1.00 0.96 45
afghan_hound 0.93 0.97 0.95 70
african_hunting_dog 0.98 1.00 0.99 50
airedale 0.92 0.89 0.90 61
american_staffordshire_terrier 0.77 0.77 0.77 47
appenzeller 0.83 0.78 0.80 45
australian_terrier 0.85 0.90 0.88 59
basenji 0.98 0.90 0.94 63
basset 0.91 0.96 0.93 51
beagle 0.87 0.90 0.88 59
bedlington_terrier 1.00 0.98 0.99 54
bernese_mountain_dog 0.97 0.97 0.97 66
black-and-tan_coonhound 0.94 0.94 0.94 47
blenheim_spaniel 0.97 0.98 0.97 57
bloodhound 0.93 0.96 0.95 54
bluetick 0.88 0.86 0.87 51
border_collie 0.89 0.93 0.91 44
border_terrier 0.96 0.88 0.92 52
borzoi 0.92 1.00 0.96 45
boston_bull 0.85 1.00 0.92 53
bouvier_des_flandres 0.98 0.91 0.95 47
boxer 0.89 0.89 0.89 45
brabancon_griffon 0.98 1.00 0.99 43
briard 0.88 0.88 0.88 43
brittany_spaniel 0.88 0.86 0.87 44
bull_mastiff 0.90 0.96 0.93 46
cairn 0.96 0.92 0.94 60
cardigan 0.88 0.93 0.91 46
chesapeake_bay_retriever 0.79 0.98 0.87 49
chihuahua 0.73 0.57 0.64 14
chow 0.97 1.00 0.98 57
clumber 0.96 0.98 0.97 45
cocker_spaniel 0.98 0.87 0.92 46
collie 0.86 0.81 0.84 47
curly-coated_retriever 0.98 0.93 0.95 44
dandie_dinmont 0.95 1.00 0.97 53
dhole 0.94 0.98 0.96 45
dingo 0.91 0.85 0.88 47
doberman 0.86 0.95 0.90 44
english_foxhound 0.83 0.83 0.83 48
english_setter 1.00 0.96 0.98 48
english_springer 0.83 0.96 0.89 46
entlebucher 0.91 0.94 0.92 63
eskimo_dog 0.60 0.56 0.58 43
flat-coated_retriever 0.95 0.95 0.95 44
french_bulldog 0.93 0.91 0.92 45
german_shepherd 0.95 0.93 0.94 44
german_short-haired_pointer 0.95 0.78 0.85 45
giant_schnauzer 0.84 0.91 0.87 45
golden_retriever 0.89 0.98 0.93 43
gordon_setter 0.98 0.98 0.98 46
great_dane 0.95 0.91 0.93 46
great_pyrenees 0.82 0.95 0.88 64
greater_swiss_mountain_dog 0.90 0.92 0.91 49
groenendael 0.98 0.96 0.97 46
ibizan_hound 0.90 0.95 0.92 55
irish_setter 1.00 0.92 0.96 48
irish_terrier 1.00 0.82 0.90 50
irish_water_spaniel 0.96 0.96 0.96 45
irish_wolfhound 0.72 0.95 0.82 63
italian_greyhound 0.80 0.89 0.84 54
japanese_spaniel 0.90 0.96 0.93 57
keeshond 0.98 1.00 0.99 47
kelpie 0.85 0.85 0.85 47
kerry_blue_terrier 0.96 0.88 0.92 52
komondor 0.96 1.00 0.98 44
kuvasz 0.97 0.82 0.89 44
labrador_retriever 0.96 0.86 0.91 50
lakeland_terrier 0.87 0.81 0.84 59
leonberg 0.97 0.92 0.94 63
lhasa 0.88 0.84 0.86 55
malamute 0.69 0.84 0.76 51
malinois 0.91 0.95 0.93 44
maltese_dog 0.95 0.86 0.91 73
mexican_hairless 0.98 0.96 0.97 46
miniature_pinscher 0.89 0.86 0.88 57
miniature_poodle 0.73 0.70 0.71 46
miniature_schnauzer 0.83 0.93 0.88 46
newfoundland 0.89 0.96 0.92 57
norfolk_terrier 0.79 0.74 0.76 50
norwegian_elkhound 0.98 0.95 0.96 58
norwich_terrier 0.84 0.83 0.83 52
old_english_sheepdog 0.98 0.96 0.97 51
otterhound 0.95 0.88 0.92 43
papillon 0.96 0.95 0.96 58
pekinese 0.95 0.93 0.94 44
pembroke 0.88 0.85 0.87 54
pomeranian 0.94 0.94 0.94 65
pug 0.98 0.93 0.96 58
redbone 0.93 0.88 0.90 43
rhodesian_ridgeback 0.85 0.80 0.83 51
rottweiler 0.88 0.96 0.91 45
saint_bernard 0.98 1.00 0.99 50
saluki 0.93 0.95 0.94 59
samoyed 0.97 0.92 0.94 65
schipperke 0.98 0.96 0.97 47
scotch_terrier 0.94 1.00 0.97 47
scottish_deerhound 0.96 0.73 0.83 71
sealyham_terrier 1.00 0.96 0.98 57
shetland_sheepdog 0.93 0.87 0.90 46
shih-tzu 0.86 0.92 0.89 65
siberian_husky 0.73 0.61 0.67 57
silky_terrier 0.85 0.96 0.90 54
soft-coated_wheaten_terrier 0.86 0.93 0.89 45
staffordshire_bullterrier 0.81 0.83 0.82 46
standard_poodle 0.88 0.81 0.84 47
standard_schnauzer 0.94 0.64 0.76 45
sussex_spaniel 1.00 0.96 0.98 45
tibetan_mastiff 0.98 0.95 0.97 44
tibetan_terrier 0.84 0.98 0.90 62
toy_poodle 0.80 0.80 0.80 46
toy_terrier 0.91 0.82 0.86 50
vizsla 0.91 0.95 0.93 44
walker_hound 0.82 0.73 0.77 44
weimaraner 0.94 0.98 0.96 48
welsh_springer_spaniel 0.95 0.87 0.91 45
west_highland_white_terrier 0.87 0.98 0.92 49
whippet 0.78 0.84 0.81 56
wire-haired_fox_terrier 0.90 0.96 0.93 47
yorkshire_terrier 0.85 0.71 0.78 49
accuracy 0.90 6058
macro avg 0.90 0.90 0.90 6058
weighted avg 0.91 0.90 0.90 6058
fig, ax = plt.subplots(figsize=(40,40)) # Sample figsize in inches
sns.set(font_scale=1.6)
sns.heatmap(cm_xception, annot=True, ax=ax)
plt.show()
from keras.preprocessing.image import ImageDataGenerator
img_width ,img_height= 224, 224
batch_size = 16
train_datagen = ImageDataGenerator(
rescale=1./255,
)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
TRAIN_PATH,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode='categorical')
total_train_image_count = train_generator.samples
class_count = train_generator.num_classes
validation_generator = test_datagen.flow_from_directory(
VALID_PATH,
target_size=(img_width, img_height),
batch_size=batch_size,
class_mode='categorical',
shuffle=False)
total_val_image_count = train_generator.samples
Found 24592 images belonging to 120 classes. Found 6058 images belonging to 120 classes.
import matplotlib.pyplot as plt
from keras.preprocessing import image
train_first_dir = os.path.join(TRAIN_PATH, df['breed'][300])
fnames = [os.path.join(train_first_dir, fname) for fname in os.listdir(train_first_dir)]
img_path = fnames[4]
img = image.load_img(img_path, target_size=(img_width, img_height))
x = image.img_to_array(img)
x = x.reshape((1,) + x.shape)
i = 0
for batch in train_datagen.flow(x, batch_size=1):
    plt.figure(i)
    imgplot = plt.imshow(image.array_to_img(batch[0]))
    i += 1
    if i % 4 == 0:
        break
plt.show()
First, let's try to classify with a classical approach:
we will build our own CNN and see if a simple CNN can accurately predict the dog breeds.
from keras import layers, models, regularizers, optimizers
from keras.models import Sequential, Model
from keras.layers import Dense, Conv2D, MaxPool2D , Flatten,Dropout
from keras.layers import BatchNormalization
model_3 = Sequential()
model_3.add(Conv2D(input_shape=(img_height, img_width, 3), filters=32, kernel_size=(3,3), padding="same", activation="relu"))
model_3.add(BatchNormalization())
model_3.add(Conv2D(filters=32, kernel_size=(3,3), padding="same", activation="relu"))
model_3.add(BatchNormalization())
model_3.add(Conv2D(filters=32, kernel_size=(3,3), padding="same", activation="relu"))
model_3.add(BatchNormalization())
model_3.add(Conv2D(filters=64, kernel_size=(3,3), padding="same", activation="relu"))
model_3.add(BatchNormalization())
model_3.add(Conv2D(filters=64, kernel_size=(3,3), padding="same", activation="relu"))
model_3.add(BatchNormalization())
model_3.add(Conv2D(filters=64, kernel_size=(3,3), padding="same", activation="relu"))
model_3.add(BatchNormalization())
model_3.add(Conv2D(filters=128, kernel_size=(3,3), padding="same", activation="relu"))
model_3.add(BatchNormalization())
model_3.add(MaxPool2D((2, 2)))
model_3.add(Flatten())
model_3.add(Dense(units=128,activation="relu"))
model_3.add(BatchNormalization())
model_3.add(Dense(class_count, activation="softmax"))
from keras.optimizers import Adam
opt = Adam(learning_rate=0.002)
model_3.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['acc'])
model_3.summary()
Model: "sequential" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= conv2d (Conv2D) (None, 224, 224, 32) 896 _________________________________________________________________ batch_normalization (BatchNo (None, 224, 224, 32) 128 _________________________________________________________________ conv2d_1 (Conv2D) (None, 224, 224, 32) 9248 _________________________________________________________________ batch_normalization_1 (Batch (None, 224, 224, 32) 128 _________________________________________________________________ conv2d_2 (Conv2D) (None, 224, 224, 32) 9248 _________________________________________________________________ batch_normalization_2 (Batch (None, 224, 224, 32) 128 _________________________________________________________________ conv2d_3 (Conv2D) (None, 224, 224, 64) 18496 _________________________________________________________________ batch_normalization_3 (Batch (None, 224, 224, 64) 256 _________________________________________________________________ conv2d_4 (Conv2D) (None, 224, 224, 64) 36928 _________________________________________________________________ batch_normalization_4 (Batch (None, 224, 224, 64) 256 _________________________________________________________________ conv2d_5 (Conv2D) (None, 224, 224, 64) 36928 _________________________________________________________________ batch_normalization_5 (Batch (None, 224, 224, 64) 256 _________________________________________________________________ conv2d_6 (Conv2D) (None, 224, 224, 128) 73856 _________________________________________________________________ batch_normalization_6 (Batch (None, 224, 224, 128) 512 _________________________________________________________________ max_pooling2d (MaxPooling2D) (None, 112, 112, 128) 0 _________________________________________________________________ flatten (Flatten) (None, 1605632) 0 
_________________________________________________________________ dense (Dense) (None, 128) 205521024 _________________________________________________________________ batch_normalization_7 (Batch (None, 128) 512 _________________________________________________________________ dense_1 (Dense) (None, 120) 15480 ================================================================= Total params: 205,724,280 Trainable params: 205,723,192 Non-trainable params: 1,088 _________________________________________________________________
from time import strftime
steps_per_epoch = train_generator.samples // batch_size
validation_steps = validation_generator.samples // batch_size
history = model_3.fit(
    train_generator,
    steps_per_epoch=steps_per_epoch,
    epochs=10,
    validation_data=validation_generator,
    validation_steps=validation_steps
)
Epoch 1/10 1537/1537 [==============================] - 464s 299ms/step - loss: 4.5414 - acc: 0.0426 - val_loss: 4.2907 - val_acc: 0.0723 Epoch 2/10 1537/1537 [==============================] - 459s 299ms/step - loss: 3.8637 - acc: 0.1452 - val_loss: 3.6797 - val_acc: 0.2025 Epoch 3/10 1537/1537 [==============================] - 459s 298ms/step - loss: 2.1860 - acc: 0.5168 - val_loss: 2.6595 - val_acc: 0.5074 Epoch 4/10 1537/1537 [==============================] - 458s 298ms/step - loss: 0.2050 - acc: 0.9643 - val_loss: 2.8876 - val_acc: 0.5586 Epoch 5/10 1537/1537 [==============================] - 459s 299ms/step - loss: 0.0453 - acc: 0.9958 - val_loss: 2.7426 - val_acc: 0.5644 Epoch 6/10 1537/1537 [==============================] - 458s 298ms/step - loss: 0.0416 - acc: 0.9956 - val_loss: 3.2863 - val_acc: 0.5598 Epoch 7/10 1537/1537 [==============================] - 458s 298ms/step - loss: 0.1239 - acc: 0.9713 - val_loss: 3.8647 - val_acc: 0.5322 Epoch 8/10 1537/1537 [==============================] - 458s 298ms/step - loss: 0.1462 - acc: 0.9627 - val_loss: 4.6165 - val_acc: 0.5474 Epoch 9/10 1537/1537 [==============================] - 458s 298ms/step - loss: 0.0728 - acc: 0.9838 - val_loss: 3.4210 - val_acc: 0.5583 Epoch 10/10 1537/1537 [==============================] - 458s 298ms/step - loss: 0.0588 - acc: 0.9893 - val_loss: 3.7338 - val_acc: 0.5555
import matplotlib.pyplot as plt
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(len(acc))
plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()
We can see that without data augmentation we quickly overfit the data: after 3 epochs we had already reached 96 percent accuracy on the training data but only about 50 percent on the validation data. Even though these results are better than the model with augmentation, this model overfits and does not improve with further epochs.
Using a pretrained convolutional base
from keras.applications.xception import Xception
conv_base = Xception(weights='imagenet',
                     include_top=False,
                     input_shape=(img_width, img_height, 3))
conv_base.trainable = True
Defining the neural network
from keras import layers, models, regularizers, optimizers
from keras.models import Sequential, Model
from keras.layers import Flatten, Dense, Dropout
model_4 = models.Sequential()
model_4.add(conv_base)
model_4.add(layers.Flatten())
model_4.add(layers.Dense(512, activation='relu'))
model_4.add(layers.Dense(class_count, activation='softmax'))  # softmax (not sigmoid) for single-label, multi-class output
model_4.compile(loss='categorical_crossentropy',
                optimizer=optimizers.SGD(learning_rate=1e-4, momentum=0.90),
                metrics=['acc'])
model_4.summary()
Model: "sequential_2" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= xception (Functional) (None, 7, 7, 2048) 20861480 _________________________________________________________________ flatten_2 (Flatten) (None, 100352) 0 _________________________________________________________________ dense_4 (Dense) (None, 512) 51380736 _________________________________________________________________ dense_5 (Dense) (None, 120) 61560 ================================================================= Total params: 72,303,776 Trainable params: 72,249,248 Non-trainable params: 54,528 _________________________________________________________________
from time import strftime
steps_per_epoch = train_generator.samples // batch_size
validation_steps = validation_generator.samples // batch_size
history = model_4.fit(
    train_generator,
    steps_per_epoch=steps_per_epoch,
    epochs=50,
    validation_data=validation_generator,
    validation_steps=validation_steps
)
Epoch 1/50 1537/1537 [==============================] - 462s 297ms/step - loss: 1.6020 - acc: 0.6269 - val_loss: 0.6879 - val_acc: 0.8090 Epoch 2/50 1537/1537 [==============================] - 458s 298ms/step - loss: 0.6068 - acc: 0.8223 - val_loss: 0.5452 - val_acc: 0.8450 Epoch 3/50 1537/1537 [==============================] - 457s 297ms/step - loss: 0.4094 - acc: 0.8784 - val_loss: 0.4801 - val_acc: 0.8656 Epoch 4/50 1537/1537 [==============================] - 458s 298ms/step - loss: 0.2872 - acc: 0.9172 - val_loss: 0.4276 - val_acc: 0.8803 Epoch 5/50 1537/1537 [==============================] - 458s 298ms/step - loss: 0.2113 - acc: 0.9416 - val_loss: 0.3890 - val_acc: 0.8944 Epoch 6/50 1537/1537 [==============================] - 457s 297ms/step - loss: 0.1588 - acc: 0.9587 - val_loss: 0.3748 - val_acc: 0.8993 Epoch 7/50 1537/1537 [==============================] - 458s 298ms/step - loss: 0.1261 - acc: 0.9688 - val_loss: 0.3658 - val_acc: 0.9018 Epoch 8/50 1537/1537 [==============================] - 459s 299ms/step - loss: 0.1038 - acc: 0.9748 - val_loss: 0.3679 - val_acc: 0.9024 Epoch 9/50 1537/1537 [==============================] - 459s 298ms/step - loss: 0.0864 - acc: 0.9814 - val_loss: 0.3629 - val_acc: 0.9057 Epoch 10/50 1537/1537 [==============================] - 459s 298ms/step - loss: 0.0737 - acc: 0.9845 - val_loss: 0.3628 - val_acc: 0.9029 Epoch 11/50 1537/1537 [==============================] - 458s 298ms/step - loss: 0.0608 - acc: 0.9880 - val_loss: 0.3584 - val_acc: 0.9066 Epoch 12/50 1537/1537 [==============================] - 456s 297ms/step - loss: 0.0552 - acc: 0.9900 - val_loss: 0.3655 - val_acc: 0.9054 Epoch 13/50 1537/1537 [==============================] - 457s 297ms/step - loss: 0.0486 - acc: 0.9917 - val_loss: 0.3632 - val_acc: 0.9049 Epoch 14/50 1537/1537 [==============================] - 457s 297ms/step - loss: 0.0445 - acc: 0.9921 - val_loss: 0.3676 - val_acc: 0.9087 Epoch 15/50 1537/1537 [==============================] - 457s 
298ms/step - loss: 0.0389 - acc: 0.9939 - val_loss: 0.3643 - val_acc: 0.9082 Epoch 16/50 1537/1537 [==============================] - 459s 298ms/step - loss: 0.0359 - acc: 0.9939 - val_loss: 0.3683 - val_acc: 0.9095 Epoch 17/50 1537/1537 [==============================] - 459s 298ms/step - loss: 0.0345 - acc: 0.9943 - val_loss: 0.3700 - val_acc: 0.9082 Epoch 18/50 1537/1537 [==============================] - 458s 298ms/step - loss: 0.0301 - acc: 0.9955 - val_loss: 0.3698 - val_acc: 0.9071 Epoch 19/50 1537/1537 [==============================] - 458s 298ms/step - loss: 0.0291 - acc: 0.9953 - val_loss: 0.3719 - val_acc: 0.9086 Epoch 20/50 1537/1537 [==============================] - 458s 298ms/step - loss: 0.0281 - acc: 0.9953 - val_loss: 0.3723 - val_acc: 0.9102 Epoch 21/50 1537/1537 [==============================] - 458s 298ms/step - loss: 0.0257 - acc: 0.9959 - val_loss: 0.3764 - val_acc: 0.9084 Epoch 22/50 1537/1537 [==============================] - 459s 298ms/step - loss: 0.0253 - acc: 0.9960 - val_loss: 0.3789 - val_acc: 0.9062 Epoch 23/50 1537/1537 [==============================] - 458s 298ms/step - loss: 0.0217 - acc: 0.9966 - val_loss: 0.3754 - val_acc: 0.9077 Epoch 24/50 1537/1537 [==============================] - 458s 298ms/step - loss: 0.0228 - acc: 0.9959 - val_loss: 0.3757 - val_acc: 0.9072 Epoch 25/50 1537/1537 [==============================] - 458s 298ms/step - loss: 0.0206 - acc: 0.9968 - val_loss: 0.3773 - val_acc: 0.9072 Epoch 26/50 1537/1537 [==============================] - 457s 298ms/step - loss: 0.0193 - acc: 0.9970 - val_loss: 0.3771 - val_acc: 0.9082 Epoch 27/50 1537/1537 [==============================] - 458s 298ms/step - loss: 0.0197 - acc: 0.9967 - val_loss: 0.3809 - val_acc: 0.9066 Epoch 28/50 1537/1537 [==============================] - 458s 298ms/step - loss: 0.0191 - acc: 0.9968 - val_loss: 0.3806 - val_acc: 0.9048 Epoch 29/50 1537/1537 [==============================] - 458s 298ms/step - loss: 0.0195 - acc: 0.9964 - val_loss: 
0.3810 - val_acc: 0.9081 Epoch 30/50 1537/1537 [==============================] - 459s 299ms/step - loss: 0.0170 - acc: 0.9973 - val_loss: 0.3800 - val_acc: 0.9061 Epoch 31/50 1537/1537 [==============================] - 457s 298ms/step - loss: 0.0169 - acc: 0.9972 - val_loss: 0.3818 - val_acc: 0.9082 Epoch 32/50 1537/1537 [==============================] - 457s 297ms/step - loss: 0.0166 - acc: 0.9970 - val_loss: 0.3829 - val_acc: 0.9089 Epoch 33/50 1537/1537 [==============================] - 456s 297ms/step - loss: 0.0146 - acc: 0.9974 - val_loss: 0.3849 - val_acc: 0.9084 Epoch 34/50 1537/1537 [==============================] - 456s 297ms/step - loss: 0.0143 - acc: 0.9975 - val_loss: 0.3924 - val_acc: 0.9077 Epoch 35/50 1537/1537 [==============================] - 457s 297ms/step - loss: 0.0139 - acc: 0.9976 - val_loss: 0.3954 - val_acc: 0.9076 Epoch 36/50 1537/1537 [==============================] - 457s 298ms/step - loss: 0.0148 - acc: 0.9974 - val_loss: 0.3888 - val_acc: 0.9084 Epoch 37/50 1537/1537 [==============================] - 458s 298ms/step - loss: 0.0136 - acc: 0.9973 - val_loss: 0.3913 - val_acc: 0.9099 Epoch 38/50 1537/1537 [==============================] - 459s 299ms/step - loss: 0.0136 - acc: 0.9976 - val_loss: 0.3916 - val_acc: 0.9077 Epoch 39/50 1537/1537 [==============================] - 459s 298ms/step - loss: 0.0124 - acc: 0.9977 - val_loss: 0.3915 - val_acc: 0.9094 Epoch 40/50 1537/1537 [==============================] - 459s 298ms/step - loss: 0.0119 - acc: 0.9979 - val_loss: 0.3936 - val_acc: 0.9095 Epoch 41/50 1537/1537 [==============================] - 458s 298ms/step - loss: 0.0127 - acc: 0.9979 - val_loss: 0.3974 - val_acc: 0.9086 Epoch 42/50 1537/1537 [==============================] - 460s 299ms/step - loss: 0.0112 - acc: 0.9981 - val_loss: 0.3957 - val_acc: 0.9086 Epoch 43/50 1537/1537 [==============================] - 459s 298ms/step - loss: 0.0119 - acc: 0.9977 - val_loss: 0.3987 - val_acc: 0.9090 Epoch 44/50 1537/1537 
[==============================] - 458s 298ms/step - loss: 0.0119 - acc: 0.9976 - val_loss: 0.4017 - val_acc: 0.9089 Epoch 45/50 1537/1537 [==============================] - 457s 297ms/step - loss: 0.0113 - acc: 0.9975 - val_loss: 0.3999 - val_acc: 0.9081 Epoch 46/50 1537/1537 [==============================] - 457s 297ms/step - loss: 0.0110 - acc: 0.9976 - val_loss: 0.3984 - val_acc: 0.9095 Epoch 47/50 1537/1537 [==============================] - 458s 298ms/step - loss: 0.0104 - acc: 0.9982 - val_loss: 0.4017 - val_acc: 0.9087 Epoch 48/50 1537/1537 [==============================] - 459s 299ms/step - loss: 0.0110 - acc: 0.9977 - val_loss: 0.3951 - val_acc: 0.9112 Epoch 49/50 1537/1537 [==============================] - 458s 298ms/step - loss: 0.0104 - acc: 0.9978 - val_loss: 0.4041 - val_acc: 0.9072 Epoch 50/50 1537/1537 [==============================] - 458s 298ms/step - loss: 0.0114 - acc: 0.9971 - val_loss: 0.4078 - val_acc: 0.9089
import matplotlib.pyplot as plt
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(len(acc))
plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()
We can see that after 5 epochs we come close to the maximum validation accuracy, and the graph shows how quickly we overfit the data. We think using data augmentation is better because it helps us avoid overfitting, and if we wanted to deploy the model in a production application, it is better to have a model that has seen more varied data rather than one that saw the same data many times over.
After reading this paper: https://openreview.net/forum?id=YicbFdNTTy
we decided to try a Vision Transformer (ViT) approach.
Historically, the best-performing models for image classification have been deep convolutional networks like ResNet, Xception, etc.
This paper proposes a different approach that does not rely on the convolution operator.
Traditionally, transformers were used for NLP applications.
The steps are: Split an image into patches
Flatten the patches
Produce lower-dimensional linear embeddings from the flattened patches
Add positional embeddings
Feed the sequence as an input to a standard transformer encoder
Pretrain the model with image labels (fully supervised on a huge dataset)
Finetune on the downstream dataset for image classification
The main advantage of this approach is lower computational requirements.
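The first four steps above can be sketched in NumPy. This is a toy illustration, not the vit_keras implementation: the projection matrix and positional embeddings are random stand-ins for learned weights, and the class token is omitted.

```python
import numpy as np

def image_to_patch_embeddings(image, patch_size=32, embed_dim=768, rng=None):
    """Sketch of the ViT input pipeline: split -> flatten -> project -> add positions."""
    rng = np.random.default_rng(0) if rng is None else rng
    h, w, c = image.shape
    # 1. Split the image into non-overlapping patch_size x patch_size patches
    patches = image.reshape(h // patch_size, patch_size,
                            w // patch_size, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    # 2. Flatten each patch into a vector of length patch_size*patch_size*c
    n_patches = (h // patch_size) * (w // patch_size)
    flat = patches.reshape(n_patches, patch_size * patch_size * c)
    # 3. Linear projection to the transformer's embedding dimension (learned in a real ViT)
    projection = rng.standard_normal((flat.shape[1], embed_dim)) * 0.02
    embeddings = flat @ projection
    # 4. Add positional embeddings (also learned in a real ViT)
    positions = rng.standard_normal((n_patches, embed_dim)) * 0.02
    return embeddings + positions  # this sequence is fed to the transformer encoder

# A 224x224 RGB image with 32x32 patches (as in ViT-B/32) yields 49 tokens
tokens = image_to_patch_embeddings(np.zeros((224, 224, 3)), patch_size=32)
print(tokens.shape)  # (49, 768)
```

This shows why the sequence length, and hence the compute, is controlled by the patch size: larger patches mean fewer tokens.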

from keras.preprocessing.image import ImageDataGenerator
img_width, img_height = 224, 224
batch_size = 32
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    shear_range=0.1,
    zoom_range=0.15
)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    TRAIN_PATH,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical')
total_train_image_count = train_generator.samples
class_count = train_generator.num_classes
validation_generator = test_datagen.flow_from_directory(
    VALID_PATH,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=False)
total_val_image_count = validation_generator.samples  # was train_generator.samples by mistake
Found 24592 images belonging to 120 classes. Found 6058 images belonging to 120 classes.
We can already see one advantage of ViT here.
We are using a similar image data generator as before, but this time we can use a batch size of 32.
Before, we could not do that because we did not have enough memory on the GPU.
from vit_keras import vit
import tensorflow as tf
import tensorflow_addons as tfa
Downloading a pretrained ViT model (pretrained on imagenet)
vit_model = vit.vit_b32(
    image_size=img_width,
    activation='softmax',
    pretrained=True,
    include_top=False,
    pretrained_top=False,
    classes=120)
Downloading data from https://github.com/faustomorales/vit-keras/releases/download/dl/ViT-B_32_imagenet21k+imagenet2012.npz 353255424/353253686 [==============================] - 322s 1us/step
We will use the ViT model as the first layer followed by a fully connected one with 256 nodes.
model = tf.keras.Sequential([
    vit_model,
    tf.keras.layers.Flatten(),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(256, activation=tfa.activations.gelu),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(120, activation='softmax')
],
    name='vision_transformer')
model.summary()
Model: "vision_transformer" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= vit-b32 (Functional) (None, 768) 87455232 _________________________________________________________________ flatten_2 (Flatten) (None, 768) 0 _________________________________________________________________ batch_normalization_9 (Batch (None, 768) 3072 _________________________________________________________________ dense_2 (Dense) (None, 256) 196864 _________________________________________________________________ batch_normalization_10 (Batc (None, 256) 1024 _________________________________________________________________ dense_3 (Dense) (None, 120) 30840 ================================================================= Total params: 87,687,032 Trainable params: 87,684,984 Non-trainable params: 2,048 _________________________________________________________________
Defining the learning rate and then the optimizer.
We also defined a dynamic learning rate: if validation accuracy reaches a plateau, the learning rate is reduced to try to squeeze out further improvement.
We defined early stopping, which halts the learning process if there is no improvement for 5 epochs.
And also a checkpoint callback to save the best weights so far.
learning_rate = 1e-4
optimizer = tfa.optimizers.RectifiedAdam(learning_rate = learning_rate)
model.compile(optimizer = optimizer,
loss = tf.keras.losses.CategoricalCrossentropy(label_smoothing = 0.2),
metrics = ['accuracy'])
steps_per_epoch = train_generator.samples // batch_size  # integer division: fit_generator expects whole steps
validation_steps = validation_generator.samples // batch_size
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor='val_accuracy',
                                                 factor=0.2,
                                                 patience=2,
                                                 verbose=1,
                                                 min_delta=1e-4,
                                                 min_lr=1e-6,
                                                 mode='max')
earlystopping = tf.keras.callbacks.EarlyStopping(monitor='val_accuracy',
                                                 min_delta=1e-4,
                                                 patience=5,
                                                 mode='max',
                                                 restore_best_weights=True,
                                                 verbose=1)
checkpointer = tf.keras.callbacks.ModelCheckpoint(filepath='./model.hdf5',
                                                  monitor='val_accuracy',
                                                  verbose=1,
                                                  save_best_only=True,
                                                  mode='max')
callbacks = [earlystopping, reduce_lr, checkpointer]
model.fit_generator(train_generator,
                    steps_per_epoch=steps_per_epoch,
                    validation_data=validation_generator,
                    validation_steps=validation_steps,
                    epochs=60,
                    callbacks=callbacks)
model.save_weights('model.h5')  # model.save() has no save_weights_only argument; use save_weights() to store only the weights
Epoch 1/60 768/768 [==============================] - 417s 480ms/step - loss: 3.6880 - accuracy: 0.3406 - val_loss: 2.3107 - val_accuracy: 0.7291 Epoch 00001: val_accuracy improved from -inf to 0.72912, saving model to .\model.hdf5 Epoch 2/60 768/768 [==============================] - 367s 477ms/step - loss: 2.4837 - accuracy: 0.6536 - val_loss: 2.2107 - val_accuracy: 0.7605 Epoch 00002: val_accuracy improved from 0.72912 to 0.76048, saving model to .\model.hdf5 Epoch 3/60 768/768 [==============================] - 367s 477ms/step - loss: 2.2974 - accuracy: 0.7121 - val_loss: 2.1648 - val_accuracy: 0.7763 Epoch 00003: val_accuracy improved from 0.76048 to 0.77633, saving model to .\model.hdf5 Epoch 4/60 768/768 [==============================] - 367s 478ms/step - loss: 2.1920 - accuracy: 0.7451 - val_loss: 2.1458 - val_accuracy: 0.7707 Epoch 00004: val_accuracy did not improve from 0.77633 Epoch 5/60 768/768 [==============================] - 367s 477ms/step - loss: 2.0734 - accuracy: 0.7908 - val_loss: 2.0779 - val_accuracy: 0.8013 Epoch 00005: val_accuracy improved from 0.77633 to 0.80125, saving model to .\model.hdf5 Epoch 6/60 768/768 [==============================] - 366s 476ms/step - loss: 1.9894 - accuracy: 0.8223 - val_loss: 2.0171 - val_accuracy: 0.8186 Epoch 00006: val_accuracy improved from 0.80125 to 0.81859, saving model to .\model.hdf5 Epoch 7/60 768/768 [==============================] - 367s 477ms/step - loss: 1.9222 - accuracy: 0.8475 - val_loss: 1.9781 - val_accuracy: 0.8354 Epoch 00007: val_accuracy improved from 0.81859 to 0.83542, saving model to .\model.hdf5 Epoch 8/60 768/768 [==============================] - 367s 477ms/step - loss: 1.8578 - accuracy: 0.8752 - val_loss: 1.9776 - val_accuracy: 0.8437 Epoch 00008: val_accuracy improved from 0.83542 to 0.84368, saving model to .\model.hdf5 Epoch 9/60 768/768 [==============================] - 369s 479ms/step - loss: 1.8109 - accuracy: 0.8934 - val_loss: 1.9756 - val_accuracy: 0.8399 Epoch 
00009: val_accuracy did not improve from 0.84368 Epoch 10/60 768/768 [==============================] - 368s 478ms/step - loss: 1.7788 - accuracy: 0.9041 - val_loss: 1.9453 - val_accuracy: 0.8534 Epoch 00010: val_accuracy improved from 0.84368 to 0.85342, saving model to .\model.hdf5 Epoch 11/60 768/768 [==============================] - 370s 481ms/step - loss: 1.7323 - accuracy: 0.9229 - val_loss: 1.9466 - val_accuracy: 0.8547 Epoch 00011: val_accuracy improved from 0.85342 to 0.85474, saving model to .\model.hdf5 Epoch 12/60 768/768 [==============================] - 367s 478ms/step - loss: 1.7160 - accuracy: 0.9282 - val_loss: 1.9587 - val_accuracy: 0.8473 Epoch 00012: val_accuracy did not improve from 0.85474 Epoch 13/60 768/768 [==============================] - 366s 476ms/step - loss: 1.6822 - accuracy: 0.9401 - val_loss: 1.9246 - val_accuracy: 0.8635 Epoch 00013: val_accuracy improved from 0.85474 to 0.86349, saving model to .\model.hdf5 Epoch 14/60 768/768 [==============================] - 366s 476ms/step - loss: 1.6576 - accuracy: 0.9484 - val_loss: 1.9461 - val_accuracy: 0.8537 Epoch 00014: val_accuracy did not improve from 0.86349 Epoch 15/60 768/768 [==============================] - 367s 478ms/step - loss: 1.6579 - accuracy: 0.9463 - val_loss: 1.9518 - val_accuracy: 0.8529 Epoch 00015: ReduceLROnPlateau reducing learning rate to 1.9999999494757503e-05. 
Epoch 00015: val_accuracy did not improve from 0.86349 Epoch 16/60 768/768 [==============================] - 366s 476ms/step - loss: 1.5816 - accuracy: 0.9750 - val_loss: 1.8751 - val_accuracy: 0.8788 Epoch 00016: val_accuracy improved from 0.86349 to 0.87884, saving model to .\model.hdf5 Epoch 17/60 768/768 [==============================] - 367s 477ms/step - loss: 1.5505 - accuracy: 0.9873 - val_loss: 1.8691 - val_accuracy: 0.8764 Epoch 00017: val_accuracy did not improve from 0.87884 Epoch 18/60 768/768 [==============================] - 366s 476ms/step - loss: 1.5419 - accuracy: 0.9876 - val_loss: 1.8631 - val_accuracy: 0.8785 Epoch 00018: ReduceLROnPlateau reducing learning rate to 3.999999898951501e-06. Epoch 00018: val_accuracy did not improve from 0.87884 Epoch 19/60 768/768 [==============================] - 367s 477ms/step - loss: 1.5334 - accuracy: 0.9910 - val_loss: 1.8505 - val_accuracy: 0.8833 Epoch 00019: val_accuracy improved from 0.87884 to 0.88329, saving model to .\model.hdf5 Epoch 20/60 768/768 [==============================] - 366s 476ms/step - loss: 1.5291 - accuracy: 0.9914 - val_loss: 1.8507 - val_accuracy: 0.8826 Epoch 00020: val_accuracy did not improve from 0.88329 Epoch 21/60 768/768 [==============================] - 369s 480ms/step - loss: 1.5264 - accuracy: 0.9934 - val_loss: 1.8453 - val_accuracy: 0.8835 Epoch 00021: val_accuracy improved from 0.88329 to 0.88346, saving model to .\model.hdf5 Epoch 22/60 768/768 [==============================] - 368s 479ms/step - loss: 1.5247 - accuracy: 0.9925 - val_loss: 1.8453 - val_accuracy: 0.8848 Epoch 00022: val_accuracy improved from 0.88346 to 0.88478, saving model to .\model.hdf5 Epoch 23/60 768/768 [==============================] - 366s 476ms/step - loss: 1.5219 - accuracy: 0.9934 - val_loss: 1.8461 - val_accuracy: 0.8828 Epoch 00023: val_accuracy did not improve from 0.88478 Epoch 24/60 768/768 [==============================] - 368s 478ms/step - loss: 1.5218 - accuracy: 0.9933 - 
val_loss: 1.8459 - val_accuracy: 0.8838 Epoch 00024: ReduceLROnPlateau reducing learning rate to 1e-06. Epoch 00024: val_accuracy did not improve from 0.88478 Epoch 25/60 768/768 [==============================] - 369s 479ms/step - loss: 1.5184 - accuracy: 0.9946 - val_loss: 1.8440 - val_accuracy: 0.8840 Epoch 00025: val_accuracy did not improve from 0.88478 Epoch 26/60 768/768 [==============================] - 367s 478ms/step - loss: 1.5194 - accuracy: 0.9940 - val_loss: 1.8439 - val_accuracy: 0.8836 Epoch 00026: val_accuracy did not improve from 0.88478 Epoch 27/60 768/768 [==============================] - 367s 477ms/step - loss: 1.5188 - accuracy: 0.9938 - val_loss: 1.8430 - val_accuracy: 0.8831 Restoring model weights from the end of the best epoch. Epoch 00027: val_accuracy did not improve from 0.88478 Epoch 00027: early stopping
Note: our original call, `model.save('model.h5', save_weights_only = True)`, raised `TypeError: save() got an unexpected keyword argument 'save_weights_only'` — `save()` has no such argument; `model.save_weights('model.h5')` is the correct way to store only the weights. The training itself completed, so only the save call failed.
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report
Y_pred = model.predict_generator(validation_generator, validation_steps)
y_pred = np.argmax(Y_pred, axis=1)
cm_vig = confusion_matrix(validation_generator.classes, y_pred)
print('Classification Report')
print(classification_report(validation_generator.classes, y_pred, target_names=labels.values()))
Classification Report
precision recall f1-score support
affenpinscher 0.98 0.91 0.94 45
afghan_hound 0.92 0.93 0.92 70
african_hunting_dog 0.96 0.98 0.97 50
airedale 0.93 0.90 0.92 61
american_staffordshire_terrier 0.69 0.74 0.71 47
appenzeller 0.92 0.78 0.84 45
australian_terrier 0.85 0.93 0.89 59
basenji 0.92 0.94 0.93 63
basset 0.82 0.96 0.88 51
beagle 0.89 0.83 0.86 59
bedlington_terrier 0.96 0.96 0.96 54
bernese_mountain_dog 0.90 0.97 0.93 66
black-and-tan_coonhound 0.92 0.94 0.93 47
blenheim_spaniel 0.95 0.96 0.96 57
bloodhound 0.90 0.83 0.87 54
bluetick 0.92 0.88 0.90 51
border_collie 0.78 0.91 0.84 44
border_terrier 0.92 0.88 0.90 52
borzoi 0.89 0.93 0.91 45
boston_bull 0.88 0.92 0.90 53
bouvier_des_flandres 0.93 0.87 0.90 47
boxer 0.88 0.84 0.86 45
brabancon_griffon 0.93 0.95 0.94 43
briard 0.84 0.84 0.84 43
brittany_spaniel 0.83 0.86 0.84 44
bull_mastiff 0.81 0.96 0.88 46
cairn 0.89 0.80 0.84 60
cardigan 0.89 0.85 0.87 46
chesapeake_bay_retriever 0.94 0.94 0.94 49
chihuahua 0.50 0.21 0.30 14
chow 0.93 0.93 0.93 57
clumber 0.98 0.98 0.98 45
cocker_spaniel 0.88 0.78 0.83 46
collie 0.85 0.70 0.77 47
curly-coated_retriever 0.93 0.89 0.91 44
dandie_dinmont 0.92 0.92 0.92 53
dhole 1.00 0.93 0.97 45
dingo 0.96 0.91 0.93 47
doberman 0.83 0.91 0.87 44
english_foxhound 0.75 0.75 0.75 48
english_setter 0.94 0.94 0.94 48
english_springer 0.87 0.89 0.88 46
entlebucher 0.94 0.92 0.93 63
eskimo_dog 0.74 0.65 0.69 43
flat-coated_retriever 0.82 0.84 0.83 44
french_bulldog 0.87 0.89 0.88 45
german_shepherd 0.93 0.89 0.91 44
german_short-haired_pointer 0.93 0.89 0.91 45
giant_schnauzer 0.73 0.89 0.80 45
golden_retriever 0.83 0.93 0.88 43
gordon_setter 0.95 0.91 0.93 46
great_dane 0.84 0.78 0.81 46
great_pyrenees 0.91 0.94 0.92 64
greater_swiss_mountain_dog 0.90 0.90 0.90 49
groenendael 0.92 0.98 0.95 46
ibizan_hound 0.95 0.95 0.95 55
irish_setter 1.00 0.85 0.92 48
irish_terrier 0.92 0.90 0.91 50
irish_water_spaniel 0.93 0.93 0.93 45
irish_wolfhound 0.90 0.90 0.90 63
italian_greyhound 0.82 0.87 0.85 54
japanese_spaniel 0.91 0.93 0.92 57
keeshond 0.96 0.98 0.97 47
kelpie 0.81 0.83 0.82 47
kerry_blue_terrier 0.92 0.87 0.89 52
komondor 0.93 0.98 0.96 44
kuvasz 0.93 0.93 0.93 44
labrador_retriever 0.89 0.82 0.85 50
lakeland_terrier 0.91 0.85 0.88 59
leonberg 0.95 0.95 0.95 63
lhasa 0.84 0.87 0.86 55
malamute 0.83 0.86 0.85 51
malinois 0.80 0.91 0.85 44
maltese_dog 0.86 0.85 0.86 73
mexican_hairless 0.98 0.91 0.94 46
miniature_pinscher 0.93 0.93 0.93 57
miniature_poodle 0.85 0.72 0.78 46
miniature_schnauzer 0.92 0.76 0.83 46
newfoundland 0.85 0.91 0.88 57
norfolk_terrier 0.83 0.80 0.82 50
norwegian_elkhound 0.98 0.97 0.97 58
norwich_terrier 0.78 0.87 0.82 52
old_english_sheepdog 0.92 0.94 0.93 51
otterhound 0.90 0.84 0.87 43
papillon 0.98 0.93 0.96 58
pekinese 0.86 0.82 0.84 44
pembroke 0.84 0.94 0.89 54
pomeranian 0.91 0.91 0.91 65
pug 0.92 0.84 0.88 58
redbone 0.91 0.98 0.94 43
rhodesian_ridgeback 0.84 0.82 0.83 51
rottweiler 0.85 0.98 0.91 45
saint_bernard 0.89 0.98 0.93 50
saluki 0.91 0.85 0.88 59
samoyed 0.95 0.88 0.91 65
schipperke 0.95 0.89 0.92 47
scotch_terrier 0.77 0.94 0.85 47
scottish_deerhound 0.95 0.85 0.90 71
sealyham_terrier 0.98 0.95 0.96 57
shetland_sheepdog 0.89 0.91 0.90 46
shih-tzu 0.85 0.88 0.86 65
siberian_husky 0.72 0.77 0.75 57
silky_terrier 0.89 0.89 0.89 54
soft-coated_wheaten_terrier 0.91 0.93 0.92 45
staffordshire_bullterrier 0.86 0.78 0.82 46
standard_poodle 0.82 0.79 0.80 47
standard_schnauzer 0.76 0.76 0.76 45
sussex_spaniel 0.91 0.96 0.93 45
tibetan_mastiff 0.91 0.93 0.92 44
tibetan_terrier 0.80 0.92 0.86 62
toy_poodle 0.89 0.74 0.81 46
toy_terrier 0.88 0.84 0.86 50
vizsla 0.91 0.95 0.93 44
walker_hound 0.81 0.77 0.79 44
weimaraner 0.91 0.90 0.91 48
welsh_springer_spaniel 0.90 0.82 0.86 45
west_highland_white_terrier 0.81 0.94 0.87 49
whippet 0.71 0.89 0.79 56
wire-haired_fox_terrier 0.94 0.98 0.96 47
yorkshire_terrier 0.88 0.78 0.83 49
accuracy 0.88 6058
macro avg 0.88 0.88 0.88 6058
weighted avg 0.89 0.88 0.88 6058
import seaborn as sns
fig, ax = plt.subplots(figsize=(40, 40))  # large figure: the matrix is 120x120
sns.set(font_scale=1.6)
sns.heatmap(cm_vig, annot=True, ax=ax)
plt.show()
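With 120 classes the heatmap is hard to read at a glance, so it can help to list the largest off-diagonal entries directly. A minimal sketch, here demonstrated on a small toy matrix (in the notebook it would be called with `cm_vig` and the breed names from `labels`):

```python
import numpy as np

def most_confused_pairs(cm, class_names, top_k=5):
    """Return the top_k (true, predicted, count) off-diagonal confusion entries."""
    cm = cm.copy()
    np.fill_diagonal(cm, 0)  # ignore correct predictions on the diagonal
    flat_idx = np.argsort(cm, axis=None)[::-1][:top_k]
    pairs = []
    for idx in flat_idx:
        i, j = np.unravel_index(idx, cm.shape)
        pairs.append((class_names[i], class_names[j], int(cm[i, j])))
    return pairs

# Toy 3-class example: class 'b' is mistaken for 'c' 7 times
cm = np.array([[10, 2, 0],
               [1,  8, 7],
               [0,  3, 9]])
print(most_confused_pairs(cm, ['a', 'b', 'c'], top_k=2))
# [('b', 'c', 7), ('c', 'b', 3)]
```

For our data we would expect visually similar breeds (e.g. the husky/malamute or terrier groups, which have the lowest f1-scores above) to dominate this list.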
We can see that even though the ViT performed slightly worse than the Xception network (88% vs. 90% validation accuracy), training took much less time (2.5 hours vs. 7 hours).
We think that with more tuning of the ViT's hyperparameters we could perhaps match the Xception results.
Results by model/algorithm
Naive Bayes - As predicted, a very poor model, but helpful as a first baseline: 4% accuracy. 3 hours training + 3 hours loading the data into a DataFrame.
Gabor + SVM - With Gabor filters we got much better accuracy, but still not good enough: 24% accuracy. 3 hours training + 2 hours to prepare the DataFrame.
Gabor + Random Forest - 44% accuracy, even better than the SVM approach. 1 hour training + 2 hours to prepare the DataFrame.
VGG-16 + Random Forest (100-400 estimators) - Using VGG-16 as a feature extractor and applying random forests worked a lot better than our previous models (73%, 75%, 77%). The fastest runs took about 2 hours to train, including feature extraction.
VGG-16 + XGBoost - Another try with VGG-16 features; XGBoost performed worse than the random forests, with 72% accuracy. 15 hours.
Classic CNN with data augmentation - Our own CNN gave poor results: 34% accuracy. 7 hours.
Xception with data augmentation - This pretrained CNN worked much better: 91% accuracy. 7 hours.
Classic CNN without data augmentation - Trained faster without augmentation but overfitted the data: 50% accuracy. 7 hours.
Xception without data augmentation - Trained similarly to the augmented run: 90% accuracy. 7 hours.
ViT with data augmentation - Trained much faster than Xception with similar results: 88% accuracy. 2.5 hours.
Eran: